Abstract



Reductionism is a viable strategy for designing and implementing practical program- 
ming languages, leading to solutions which are easier to extend, experiment with 
and formally analyze. 

We formally specify and implement an extensible programming language, based 
on a minimalistic first-order imperative core language plus strong abstraction mech- 
anisms, reflection and self-modification features. The language can be extended 
to very high levels: by using Lisp-style macros and code-to-code transforms which 
automatically rewrite high-level expressions into core forms, we define closures and 
first-class continuations on top of the core. 

Non-self-modifying programs can be analyzed and formally reasoned about, thanks
to the language's simple semantics. We formally develop a static analysis and prove
a soundness property with respect to the dynamic semantics.

We develop a parallel garbage collector suitable for multi-core machines to permit
efficient execution of parallel programs.

Keywords: programming, language, extensibility, macro, transformation, reflec- 
tion, bootstrap, interpretation, compilation, parallelism, concurrency, garbage col- 
lection 



Résumé

Reductionism is a realistic technique for designing and implementing real
programming languages, and leads to solutions which are easier to extend,
experiment with and analyze.

We formally specify and implement an extensible programming language, based on a
minimalistic first-order imperative core language, equipped with strong abstraction
mechanisms and with reflection and self-modification features. The language can be
extended to very high levels: by using Lisp-style macros and code-to-code
transformations rewriting extended expressions into core expressions, we define
closures and first-class continuations on top of the core.

Programs which do not modify themselves can be formally analyzed, thanks to the
simplicity of the semantics. We formally develop an example static analysis and we
prove a soundness property with respect to the dynamic semantics.

We develop a parallel garbage collector suitable for multi-core machines, to permit
efficient execution of parallel programs.

Keywords: programming, language, extensibility, macro, transformation, reflection,
bootstrap, interpretation, compilation, parallelism, concurrency, garbage collection



A large, crowded maze of a building that is just one part of one 
branch of the local administration, in the Paris neighborhood. Under 
the Summer heat I've been standing there or somewhere very close since 
the early morning, awake since before 6am just for the privilege of being 
near the front of the line. It's finally my turn, after half a day spent 
waiting. And now she tells me that no, my avis d'imposition fiscal is 
not a valid justificatif de domicile. And who cares if they had told me 
the opposite in her very office: she has no intention of listening to my 
complaints. I'll have to return another day, with a signed copy of an 
identity document of my homeowner. 

After I get back to the main hall near the entrance to arrange that 
next appointment I must look as irritated as I am. The woman at the 
desk asks me what happened. When I repeat to her what I've been 
told just a few minutes before, she explodes. — What!? Come with me. 
Shouting that she'll be right back to the people waiting behind me, she 
abandons her place and angrily storms away to another office. I follow 
her. 

We sit. Between a half-muttered insult to her colleague and the next
she asks me for my papers one by one, and checks each of them. She has 
to come back to her own work: fueled by adrenaline she's thorough, but 
efficient. Last, I hand her my avis d'imposition fiscal. She compares the 
address, looks at the date, and skims the rest. — Yes, it's perfectly fine! 
That — labeling her colleague by a final one-word definition. She signs 
my dossier herself, overriding or simply ignoring the other's authority. 
I'll have to go pay the tax at the cash register, yes, right there on the 
left and yes, then I'm done. I barely have the time to thank her before 
she runs back to her desk. 



To that blondish, forty-something woman who was working in a government build- 
ing near Paris during the Summer of 2010, whatever her name is, I dedicate this 
work. 

May her rage inspire others to do the right thing. 

Luca Saiu, December 2012 
Chapter 1

Introduction 



Reductionism is a viable strategy for designing and implementing practical
programming languages, leading to solutions which are easier to extend, experiment
with and formally analyze.

Contents 

1.1 Programming language taxonomy

1.2 Hybridization and complexity

1.3 Growing a language

1.4 Our solution

1.5 Summary



Programming languages have proliferated nearly since the beginning of Com- 
puter Science. However, despite the sheer number of dialects with different syntaxes 
and details, there is still comparatively little variety in programming models and 
paradigms — yet programming problems remain at least as hard as ever. 

In order to really innovate in this field researchers need extensible languages 
which are easy to modify and experiment with, but at the same time not limited to 
simplified idealizations. Bringing the same idea out of the lab and into practice, an 
expert end-user should be able to bend and adapt the language to make it fit the 
problem, rather than the opposite. 

For this to be possible a language has to start out simple and open-ended: able 
to express different paradigms, yet not hardwired for any; easy to reason about in 
a rigorous way when needed, without being unconditionally constrained. 

1.1 Programming language taxonomy 

Languages may be classified along at least three mostly orthogonal axes: paradigm,
typing policy and concurrency model. In the following we limit ourselves to a quick
overview; review articles such as [92] contain a much more detailed topology, with
extensive examples.

Furthermore, not all of the relevant concepts have satisfactory formal definitions;
in this whirlwind tour we renounce most pretenses of being exact, and accept
speaking of very general concepts in somewhat vague terms.



1.1.1 Paradigm 

Many popular languages such as C are imperative. Imperative languages, based on 
destructive mutation of state and explicit control flow, trace their origin to Turing 
machines, and in practice are easy to understand in terms of the underlying machine
languages.

Functional languages such as Haskell, ML and Lisp, shunning or at least limiting
the occurrences of assignment statements, are basically sugared versions of some
variant of the λ-calculus; their level of abstraction is much farther away from the
hardware than that of imperative languages. Most functional languages are
higher-order, i.e. they allow functions to be passed as parameters to other
functions, and to be returned as results.

At an even higher level, relational and constraint programming attempt to sup- 
port a declarative, rather than algorithmic, style by dealing with sequences of data in 
an extensional fashion and having the user exploit data relations instead of building 
explicit data structures. Such languages tend to be based on particularly clean and 
simple mathematical theories such as relational algebra (the SQL query language) 
or some subset of the Predicate Calculus (Prolog¹).

Object-oriented languages such as Smalltalk are more pragmatic: they encourage
modelling data structures upon real-world entities by making the "behavior" of a
computational object a function of the object identity, and making it easy to define
related classes of objects by only specifying their differences.

Other families including concatenative languages such as Forth and Postscript, 
and array languages like APL can be more or less directly traced back to one of the 
four main groups. 

1.1.2 Typing policy 

Another orthogonal attribute of programming languages is their support for typing: 
programs written in statically-typed languages are mechanically analyzed prior to 
execution in order to check that some soundness property is satisfied, thus prevent- 
ing certain errors from ever occurring at runtime: the compiler will simply reject any 
"suspicious" program — invariably including some false positives. ML and Haskell 
are examples of statically typed languages with strong type systems; many popular
languages such as C, C++ and Java are also statically-typed, but their very complex
semantics do not allow the extensive static checks which are relatively easy to
perform in functional languages², and many more runtime errors remain possible:
we speak of weak static type checking.

¹ Despite not being meant as general-purpose languages, we argue that database query
languages are actually much better examples of declarative non-algorithmic
programming than logic languages: query languages allow reasoning about objects and
their relations, completely abstracting from data structures and, even more
importantly, search strategies, i.e. algorithms. By contrast, programming in Prolog
in practice requires constantly keeping in mind its operational semantics, for
reasons of efficiency and even correctness: for example just reordering two Horn
clauses, which from a logic point of view simply yields an uninteresting equivalent
variation, can easily change complexity from linear to exponential, and dramatically
alter termination properties and the number of results.

By contrast dynamically-typed languages such as Lisp, Perl and Python perform 
checks at runtime before executing each operation subject to failure, typically at 
some cost in performance but gaining expressivity in comparison to the static-typing 
case; of course dynamic typing by itself cannot statically guarantee any soundness 
property. 

Some low-level languages such as Forth and most assembly languages are untyped: each
datum is interpreted as-is without any check or conversion, assuming that it is
always a valid operand for its operator; such languages, trading safety for
efficiency, may not be suitable for all applications, yet they also definitely have
a place in programming.

A subset of statically and strongly-typed languages including ML and Haskell em- 
ploy type inference to automatically reconstruct type declarations from programs, 
rather than have the programmer provide them; this is convenient, but since type 
inference is undecidable for the most powerful type systems [18], relying on it alone 
reduces the expressivity of the language. Type inference is harder to employ in 
non-functional languages, and possibly because of cultural biases it is not widely 
used with weak type systems. 

Even if mainstream languages seem resistant to adopting any such technique, the
idea of static checking can be extended from typing to other properties computable
via static analyses (necessarily partial for nontrivial languages), such as
termination, time and space complexity, or escaping.

1.1.3 Concurrency model 

Another aspect which we must at least cite constitutes another whole axis in the lan- 
guage space topology: the model of concurrency (synchronous versus asynchronous, 
message-passing versus shared state) also has a deep impact on the language seman- 
tics, and not only on the implementation of the runtime system. 

The concurrency model of all the mainstream languages named above is asyn- 
chronous shared-state: concurrent "threads" read and mutate the same global state, 
explicitly synchronizing accesses when needed. The other important concurrency 
model is message-passing: threads or processes don't share state, but cooperate by 
exchanging messages. Erlang supports message-passing only; in the other languages 
mentioned above message-passing can be implemented on top of shared state, or is 
available as a thin "wrapper" over the inter-process communication primitives
provided by the operating system. Languages with a synchronous concurrency model
(at the software level) are a research topic [9, 17, 67] but have yet to see major
application.

² It could be argued that the idea of passing parameters to and receiving results from a function
lends itself to reasoning about compatibility; sequential side effects, on the other hand, always
"compose well" with one another in a superficial sense, but may lead to more subtle violations of
implied invariants.
Synchronous models are better suited to formal reasoning: in his formal calculi for
concurrency CCS [56] and π-calculus [53, 54] Milner considered synchronous
communication as primitive, and represented asynchronous processes by adding
(synchronous) "queue processes".

On the other hand modern parallel hardware is strongly asynchronous, and truly 
parallel synchronous implementations on top of it tend to be prohibitively inefficient. 

Older-generation languages tend to support no notion of concurrency at all. 

1.2 Hybridization and complexity 

Why are there so many programming languages? Couldn't they just agree on one?
We have all heard this naive question.

The fact that such a question can only come from a beginner is evident from 
our experience of how coding in even surprisingly similar languages "feels" different: 
for example about³ the only real semantic difference between Pascal and C is the
different strength of the type system (static in either case); yet the subjective
"experiences" of writing in the two languages are far apart, as any programmer having
used both can witness. That said, we also have to recognize that many differences 
between languages are in fact incidental, due to backward compatibility concerns or 
cultural inertia. 

Another, deeper, answer to the beginner's question is that different problems call
for different languages. But then, why not merge the greatest possible number of
features from different styles into one "perfect" language? In fact there exists such a
trend: languages inspire and influence one another, and some recent ones such as Oz
[93] even make a point of offering support for as many different paradigms as possible;
but even without looking at such extreme examples, a move towards hybridization is
evident in most recent languages: contemporary languages such as C++, Java and
C#, but also the popular "scripting languages" Python, Perl and JavaScript,
incorporate at least two paradigms (imperative and object-oriented, with some
elements of functional programming being more slowly accepted into the mainstream);
in fact it could be argued that object-oriented programming is itself firmly rooted
in the imperative paradigm, with only some restricted patterns taken from functional
languages⁴. Most object-oriented languages also have hybrid type systems, with some

³ Here we can ignore the difference in approach in working around the funarg problem [61].

⁴ The idea of late binding at the heart of object-orientation would be easy to emulate with data
structures containing functions; in fact virtual method tables are typically implemented as chained 
arrays of pointers to functions, where functions have access to a "struct" holding field values: in 
other words, chained closure arrays: if method lookup fails in one class, a link is followed and the 
next one is tried, up the inheritance chain — or sometimes sideways, in case of multiple inheritance. 

As a different kind of hybridization, other non-imperative languages now include object-oriented 
features: the ML dialect OCaml [71, 19] managed to also add objects in a mostly-functional 
language and still keep its type system strong and static, at the cost of some complexity. More 
pragmatically, most modern SQL systems include some kind of object-oriented extension, more or 
static and some dynamic checks. Often both message-passing and shared-state are
available as concurrency models. 

1.2.1 Hybridization limits 

There are clear limits to hybridization: some features regarded as desirable in
different communities are mutually incompatible, if not opposite. For example, both
having a static typing discipline and not having one may reasonably be argued to be
useful features: the former makes it possible to prove run-time properties of a
program before running it, the latter improves expressivity. In the same spirit, the
useful properties of purely functional languages [10, 40] would be destroyed by
adding an assignment operator.

But even if we forget for a moment that many possible sets of features are just 
incompatible, designing a strongly hybrid language entails giving up on finding sim- 
ple answers to programming problems, and just hoping that programmers will be 
better prepared for the unknown with a bigger toolbox: the bigger, the better. This 
pragmatic approach hits its limit when a language becomes too big to be intellec- 
tually manageable — at which stage the language may or may not be adequate for 
most tasks. 

As a further objection against hybridization, working with such large chimeras
makes it harder to experiment with language features by building prototypes; in
fact it may not be by chance that most such experimentation has historically taken
place in Lisp dialects, which we will see to be closest to our model.

1.3 Growing a language 

Guy L. Steele dealt with the issue of the "size" of programming languages in his 
famous OOPSLA 1998 keynote talk "Growing a language" [85]. In a wonderfully 
deconstructionist exploit, Steele constrained his own English to follow the same 
rigid rules of formal languages in which every non-primitive "word" needs to be ex- 
plicitly defined before use. By taking as primitives only English monosyllables he 
tried to communicate the feel of working with a very small (programming) language. 

The main point of the speech was the idea of working with a language powerful 
enough to be evolved by the user community under the coordination of a main- 
tainer; and possibly even more important, the user herself would bend the language 
to her needs, as part of the daily practice of programming. 

Even after several polysyllable definitions Steele's original prose retains its pe- 
culiar charm: 

[...] a language design of the old school is a pattern for programs. But
now we need to 'go meta.' We should now think of a language design as
a pattern for language designs, a tool for making more tools of the same
kind. [...] My point is that a good programmer in these times does not 
just write programs. A good programmer builds a working vocabulary. In 
other words, a good programmer does language design, though not from 
scratch, but by building on the frame of a base language. 

— Guy L. Steele Jr., [85] 

As the initial iteration of this process Steele proposed Java with some minor changes 
— thus at least a middle-sized language; in his opinion intentionally small languages 
such as the Lisp dialect Scheme, originally his own brainchild [89, 87], would remain 
hopelessly inadequate for modern tasks, as he tried to suggest with the one-syllable 
metaphor. 

More than a decade has since passed, and the envisaged extension of Java by the
community has not materialized⁵.

The idea of "growing a language" remains a valid strategy, if not even the only 
realistic one. Without overlooking this important engineering insight, we find it
worthwhile to spend a few words on what we do not agree with in Steele's presentation.
Anyway, since much of the controversy will center around Java, to Steele's credit we 
must at least cite his later contributions based on Fortress [86], sharing the same 
idea of a "growth plan" but with a more suitable core language. That is the point: 
what makes Fortress a better match than Java for the task? And what yet superior 
alternative can we extrapolate from this trend? 

1.3.1 Procedural and syntactic abstraction 

In our opinion it is not by chance that the crucial insight for finding the missing 
ingredient was provided in [2], co-authored by Gerald J. Sussman — the other father 
of Scheme. 

Our sketch of a language topology is reasonable but fails to really capture the 
actual expressive power of a language, as until this moment we have ignored a 
fundamental and orthogonal class of language features: means of abstraction, called 
"patterns" in Steele's quotation above. 

Starting from the very first chapter, [2] speaks about means of abstraction as ways 
of naming patterns of code, possibly with parameters, so that they may be re-used 
at will as if they were primitive. In other words a means of abstraction allows us to
factor away some code, so that we can reason at a higher level and ignore irrelevant
details unless needed. The idea is by necessity vague as it potentially extends to 
any point of our language topology and beyond, yet we do not feel much danger of 
ambiguity: any programmer will promptly recognize abstraction features. We are 
speaking about procedural abstractions (including all the obvious generalizations 

⁵ Ironically, something close to Steele's vision of mostly-decentralized extensions has
materialized in Scheme with SRFIs [80]; development is active, at least in the relatively
small scale of the Scheme community.
to functions, predicates, modules and classes); and, as a separate group, syntactic
abstractions such as macros. No other example of a syntactic abstraction feature
comes to mind and in fact no other case is in common use⁶; but we are going to
introduce another kind in §1.3.3, and more fully in §5. Modern high-level languages
support more or less adequate procedural abstractions, including higher order (C++ 
supports "lambdas" as per its latest Standard [38], and even Java should follow 
suit), but most are still very weak in syntactic abstraction; we are now proceeding 
to explain why this is important. 

1.3.2 Syntactic abstraction and core-based languages: macros 

In order to clarify what we mean by syntactic abstraction, we are now going to 
informally present a classic example. 

Let us assume an imperative language similar to Pascal or C, with a while..do..done
loop but no repeat..until; we want to extend the language so that we can
write:



procedure print_at_least_once (n):
  variable x = 1;
  repeat
    print ("x is ");
    print (x);
    newline ();
    x := x + 1;
  until x > n;
end.



With a suitable macro system we could define repeat..until as syntactic sugar,
so that for any sequence of statements s and any expression e, the loop "repeat
s until e" is rewritten into "s; while not (e) do s; done". Handwaving away
some trivial details which are not relevant here, we could say that the macro
definition is:



macro repeat <s> until <e>:
  <s>;
  while not (<e>) do
    <s>;
  done;
end.



Using the repeat..until macro, the macroexpander stage of the compiler would
automatically rewrite the above definition of print_at_least_once into:

⁶ Some historical Lisp dialects used fexprs [64] as an alternative to macros; so does a new dialect
called Kernel [78], resurrecting them thirty years later. Fexprs are also discussed at some length 
as an implementation device in [87, pp. 25-26]. 
procedure print_at_least_once (n):
  variable x = 1;
  print ("x is ");
  print (x);
  newline ();
  x := x + 1;
  while not (x > n) do
    print ("x is ");
    print (x);
    newline ();
    x := x + 1;
  done;
end.

Unsurprisingly enough, the rewritten code contains a repetition: the macro call 
"factors away" an undesired regularity which would make the code harder to maintain 
if written directly in the extended form; even in such a trivial example as this the 
code using the macro looks easier to read, and its purpose more explicit. Also notice 
how macroexpansion had only local effect: the macro call has been replaced, but 
not its surrounding code. 
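
The rewrite just described can be mimicked in a few lines of Python. This is only an illustrative sketch, not the macro system of this work: the tuple-based statement representation and the names expand_repeat and program are assumptions introduced here for the example.

```python
# Hypothetical mini-AST: every statement is a tuple whose first element is its
# kind. The "repeat" form is sugar; "seq" and "while" are assumed core forms.

def expand_repeat(stmt):
    """Rewrite every ("repeat", body, cond) into core forms, recursively."""
    kind = stmt[0]
    if kind == "repeat":
        _, body, cond = stmt
        body = [expand_repeat(s) for s in body]
        # repeat s until e   ==>   s; while not (e) do s; done
        return ("seq", body + [("while", ("not", cond), body)])
    if kind == "while":
        _, cond, body = stmt
        return ("while", cond, [expand_repeat(s) for s in body])
    if kind == "seq":
        return ("seq", [expand_repeat(s) for s in stmt[1]])
    return stmt  # any other statement is left untouched


loop_body = [("print", "x"), ("assign", "x", ("+", "x", 1))]
program = ("seq", [("assign", "x", 1),
                   ("repeat", loop_body, (">", "x", "n"))])
expanded = expand_repeat(program)
```

As in the text, only the macro call is replaced: the surrounding assignment is carried over unchanged, and the loop body appears twice in the expansion.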

It is worth stressing that repeat..until loop support cannot be defined as a
procedure, unless the language supports higher order or some more exotic language
feature such as passing statements as parameters; anyway, even if those features
were available, notice how macroexpansion might still produce a more efficient
result and possibly be safer, as it takes place entirely before runtime.

Our sample macro simply glues together pieces of code without performing any
substantial computation. This is not necessarily the case: several languages, mostly
Lisp dialects, have Turing-complete macro systems [79, 4, 47]. Other languages such
as C are limited to weak token-based preprocessors [37] whose power does not reach
much further than defining repeat..until, with uglier syntax. Other languages,
Java included, do not support syntactic abstraction at all. We argue that precisely
this weakness of Java has prevented Steele's plan from materializing.

Indeed the bottom-up programming style in which the language is extended to suit
the problem, as implicit in Steele's quotation, is very typical in the Lisp world,
and quite alien to most other communities.

Macros are so helpful in building syntactic sugar for existing languages that some
Lisp dialects such as Scheme are in fact core-based [79, §1.9, §B]: implementations
may choose to develop at a low level some core forms and define the rest of the
language as a set of macros, eventually rewriting programs into combinations of core
forms only.

As an obvious such example, a block in a higher-order functional language can
always be rewritten into the immediate application of an anonymous function with
the bound variables as its formals and the block body as its body, and with the
bound expressions as actual parameters: "let a = 1 + 2, b = 3 + 4 in a + b"
has the same operational⁷ behavior as "(λ a b . a + b) (1 + 2) (3 + 4)"; since
the language needs λ and function application anyway, we can define let as a macro
rather than having it as a primitive form, obtaining a simpler core language.

As the number of bound variables is arbitrary, the sample macro language we showed
above cannot express the rewrite quite adequately; yet the task is considered very
ordinary in Lisp dialects, which to ease metaprogramming use the same syntax for
programs and data structures. Deferring the explanation of details to §5, we show
here a possible definition of let just to highlight its simplicity⁸:



(define-macro (let bindings . body)
  `((lambda ,(map car bindings)
      ,@body)
    ,@(map cadr bindings)))



This version of let binds variables "in parallel": defined expressions have no visibility 
of bound variables. The alternative "sequential-binding" block known as let* in Lisp 
is also easy to obtain by macroexpansion, in this case by rewriting it into nested 
trivial parallel-binding blocks in which the outermost let binds the first variable 
bound by let*. Recursive macros can still be very simple: 



(define-macro (let* bindings . body)
  (if (null? bindings)
      `(let () ,@body)
      `(let ((,(caar bindings) ,(cadar bindings)))
         (let* ,(cdr bindings)
           ,@body))))



let follows the intended evaluation order under a call-by-value strategy: fully
reducing all the operands before the application forces the body to be evaluated
only after all bound expressions. Much in the same way, let* constrains the
evaluation order so that bound expressions are reduced top-to-bottom.

There are many other similar examples: the short-circuiting left-to-right versions
of the and and or operators are easy to express with a conditional: "a and b" has
the same behavior as "if a then b else false", while "a or b" can be rewritten 
into "if a then true else b". 
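
The four rewrites above (let, let*, and, or) can be collected into one small expander. The following Python sketch is only an approximation of what a Lisp macroexpander does: it assumes programs are represented as nested Python lists with strings for symbols, and the function name expand_sugar is hypothetical.

```python
# Assumed representation: an s-expression is a nested Python list, symbols are
# strings. Only "lambda", "if" and application are treated as core forms.

def expand_sugar(form):
    """Rewrite let, let*, and, or into lambda, application and if."""
    if not isinstance(form, list) or not form:
        return form  # atoms and the empty list are already core
    head = form[0]
    if head == "let":
        bindings, body = form[1], form[2:]
        names = [b[0] for b in bindings]
        values = [expand_sugar(b[1]) for b in bindings]
        # (let ((x e) ...) body ...)  ==>  ((lambda (x ...) body ...) e ...)
        return [["lambda", names] + [expand_sugar(f) for f in body]] + values
    if head == "let*":
        bindings, body = form[1], form[2:]
        if not bindings:
            return expand_sugar(["let", []] + body)
        # The outermost let binds the first variable; the rest is nested.
        return expand_sugar(["let", [bindings[0]],
                             ["let*", bindings[1:]] + body])
    if head == "and":
        return ["if", expand_sugar(form[1]), expand_sugar(form[2]), False]
    if head == "or":
        return ["if", expand_sugar(form[1]), True, expand_sugar(form[2])]
    return [expand_sugar(f) for f in form]


let_example = expand_sugar(["let", [["a", ["+", 1, 2]], ["b", ["+", 3, 4]]],
                            ["+", "a", "b"]])
let_star_example = expand_sugar(["let*", [["x", 1], ["y", ["+", "x", 1]]],
                                 ["*", "x", "y"]])
```

Unlike the one-pattern macro notation used for repeat..until, this expander inspects the list of bindings, which is what makes an arbitrary number of bound variables easy to handle.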



⁷ Despite the vagueness entailed by not having specified any particular semantics we have to at
least recognize the fact that what is equivalent operationally might not be under a corresponding
static semantics; in particular we might have good reasons for using different type rules in the
case of the expanded form. Such an observation is not new at all: Milner had already recognized
in [55, §3.5] what is now called let-polymorphism [63, §22.7]. Here we intentionally disregard any
static semantics, delaying the justification to §4. This same remark also applies to the following
examples in this subsection.

⁸ We have omitted the check verifying that bindings are two elements long. Apart from that,
the definition is perfectly realistic.
We can go even further: why have a conditional at all? By Church-encoding
booleans so that "true" is "λxy.x" and "false" is "λxy.y", and taking "•" as the
element of a unit type (or indeed as any object), we can define "if a then b else c"
as syntactic sugar for "((a (λz.b) (λz.c)) •)", for some variable z not occurring
free in b and c. Again, λ and the function application with • allow the conditional
to reduce as expected also under call-by-value.
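
The encoding can be tried directly in Python, which is also call-by-value. In this sketch the booleans are curried (one argument at a time) for convenience, None stands in for "•", and the names TRUE, FALSE, UNIT and church_if are assumptions made for the example.

```python
UNIT = None  # plays the role of "•"; any object would do

TRUE = lambda x: lambda y: x    # λxy.x, curried
FALSE = lambda x: lambda y: y   # λxy.y, curried


def church_if(condition, then_branch, else_branch):
    """((a (λz.b) (λz.c)) •): both branches are wrapped in a λz, so that
    neither is evaluated until the selected one is applied to the unit."""
    return condition(then_branch)(else_branch)(UNIT)


trace = []
result = church_if(TRUE,
                   lambda z: trace.append("then") or "b",
                   lambda z: trace.append("else") or "c")
```

Only the selected branch runs: the λz wrappers delay both branches, and the final application to the unit value forces just one of them.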

1.3.3 Transforms as syntactic abstraction 

The continuation of a subexpression of a program is a function or procedure with 
side effects which, given the result of the subexpression, returns the final result of 
the program [89, 5]. 

Programs can be automatically rewritten into Continuation-Passing Style, a normal
form making all continuations explicit as λ-terms. For example, let the
continuation of a + 2 be κ; then by using one of the transformations in [46] we obtain
(λx.(λy.κ(x + y))2)a as its CPS version. It is not too difficult to see how both
versions yield the same result:

• First we evaluate a, passing it to its continuation λx.((λy.κ(x + y))2);

• the continuation of a binds it to x, and passes 2 to its continuation
λy.(κ(x + y));

• the continuation of 2 binds it to y, evaluates the sum of x and y, and passes
the result on to κ;

• κ provides the sum of x and y to the rest of the computation, which will finally
yield the result.
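
The steps above can be replayed in Python by writing the CPS term literally. The function names direct and cps are hypothetical, and the continuation κ is passed as an ordinary Python function k.

```python
def direct(a):
    """Direct style: the continuation of a + 2 is implicit."""
    return a + 2


def cps(a, k):
    """CPS version, a literal transcription of (λx.(λy.κ(x + y)) 2) a."""
    return (lambda x: (lambda y: k(x + y))(2))(a)


identity = lambda v: v
same_result = cps(40, identity) == direct(40)
scaled = cps(40, lambda v: v * 10)  # the "rest of the computation" multiplies by 10
```

Passing the identity as k recovers the direct-style result; any other k shows how the sum is handed on to the rest of the computation.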

One desirable feature of the CPS form is its independence from the evaluation
strategy: in particular, the reduction sequence above holds for both call-by-value and
call-by-name. Because of this and other useful properties, CPS may be convenient to
use in compilers as an intermediate form to perform some semantics-preserving
optimizations [83, 5, 44]. However, since our main interest here is program
expressivity, we are particularly interested in first-class continuations: a
syntactic form such as call/cc makes it possible to access continuations as program
data, performing "jumps" into or out of expressions [79]. The call/cc form is also
rewritten into an ordinary λ-term along with the rest, by the same transformation
which turns the program into CPS.

First-class continuations are famously counter-intuitive and difficult to employ di- 
rectly, but they can simulate powerful control features such as exceptions, coroutines, 
generators, and even backtracking; by using macros we can syntactically abstract 
away the implementation of these control features, and provide a simple high-level 
syntax. 
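
To give a flavor of how a captured continuation simulates a control feature, the following Python sketch computes the product of a list in hand-written CPS and "jumps out" as soon as a zero is found, much like raising an exception. It is an assumption-laden illustration, not the call/cc of this work: the names product_cps, product and multiplications are hypothetical, and the escape continuation is threaded explicitly instead of being obtained through a call/cc form.

```python
multiplications = []  # records every multiplication actually performed


def product_cps(numbers, k, escape):
    """Product of a list in CPS; `escape` is the continuation of the whole
    computation, captured at the start as call/cc would capture it."""
    if not numbers:
        return k(1)
    head, tail = numbers[0], numbers[1:]
    if head == 0:
        # Jump out: the pending multiplications stored in k are discarded.
        return escape(0)

    def k_next(rest):
        multiplications.append((head, rest))
        return k(head * rest)

    return product_cps(tail, k_next, escape)


def product(numbers):
    identity = lambda v: v
    return product_cps(numbers, identity, identity)


normal = product([2, 3, 4])
count_after_normal = len(multiplications)
early = product([2, 0, 4])
count_after_early = len(multiplications)
```

In the second call no multiplication is performed at all: invoking the escape continuation abandons the work accumulated in k, which is the behavior a high-level exception syntax would hide from the user.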
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A prerequisite for doing this is, of course, hiding the transformed program from 
the user. As ah'eady evident from the trivial example above, CPS-transformed 
programs are long and tedious to read: the user should simply use call/cc (or, 
better, syntactic forms reducing to call/cc uses) in direct-style, untransformed 
programs. We already stated that the transformation can be automatic, and that 
it supports call/cc as well — hence such a kind of syntactic abstraction is clearly 
possible. But can we define a CPS transformation using macros? 
The answer is no: the result of CPS-transforming an expression depends on its 
context, while macros only have access to their parameters. If we are to support 
global program rewritings such as CPS, we need to introduce a second syntactic 
abstraction feature: we call it the transform⁹. Transforms map syntactic objects
into other syntactic objects, and can be employed either before defining a global 
object, or retroactively on an existing program. 

Transforms can be quite useful [46]: just as is the case for CPS, the Closure
Conversion process rewriting λ-terms into explicit closures [72, 6] may require
contextual information: unless we want to close over globals¹⁰, when building a
closure we need to know the set of variables which are bound at that program point.
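
The dependence on context can be seen in a small Python sketch of closure conversion over a toy term representation. Everything here is an assumption made for illustration: terms are tuples tagged `"var"`, `"const"`, `"lam"` and `"app"`, and the output tag `"closure"` and the function names are ours. The `bound` argument is exactly the contextual information a macro would not have.

```python
# Closure conversion on a toy λ-term representation (illustrative only).
# Terms: ("var", x) | ("const", c) | ("lam", x, body) | ("app", f, arg)

def free_variables(term):
    kind = term[0]
    if kind == "var":
        return {term[1]}
    if kind == "const":
        return set()
    if kind == "lam":
        return free_variables(term[2]) - {term[1]}
    if kind == "app":
        return free_variables(term[1]) | free_variables(term[2])
    raise ValueError(kind)

def closure_convert(term, bound=frozenset()):
    """Rewrite each lam into ("closure", captured, parameter, body).

    `bound` is the set of variables bound at this program point; only
    those are captured, so globals are deliberately left out.
    """
    kind = term[0]
    if kind in ("var", "const"):
        return term
    if kind == "app":
        return ("app",
                closure_convert(term[1], bound),
                closure_convert(term[2], bound))
    if kind == "lam":
        _, parameter, body = term
        captured = sorted(free_variables(term) & bound)
        return ("closure", captured, parameter,
                closure_convert(body, bound | {parameter}))
    raise ValueError(kind)
```

Converting the same inner `lam` under a different `bound` set yields a different result, which is why the rewriting cannot be expressed as a function of the form's own parameters alone.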

To reiterate, by composing two transforms, it is possible to build a language 
supporting first-class continuations based on a core not even containing anonymous 
λ-terms.

Program transformations are commonly seen in formal mathematical presentations, 
but to our knowledge they have not been available as an abstraction tool in any 
general-purpose programming language up to this point. 

1.3.4 Why reductionism 

It should be clear by now that taking the core-based approach to its logical extreme 
yields a very simple core language. However, before committing to that route it may
be worth pausing to consider our ultimate purpose, and of course the tradeoffs
involved. 

First of all a small core language is easy to reason about, particularly if program 
analysis is automated — which had better be, at a time when programs as written 
by humans can be millions of lines long. Moreover the core language will tend to 

⁹Some controversy remains among English purists about the use of the term "transform" versus
"transformation"; according to some, a transform is the result of a transformation. We admit 
that even in Computer Science the use of the term "transform" for the function operating the 
transformation is not universal, but we still prefer the shorter form. 

¹⁰ML dialects do in fact close over "global" variables as well, but we find that this choice
complicates interactive programming, sometimes yielding "unexpected" results in case of variable 
redefinitions. As a matter of personal taste we tend to prefer the solution of Common Lisp and 
Scheme which close over nonlocals only, not including globals. In practice we suppose that the ML 
behavior is dictated by the need not to invalidate the results of previous type inferences whenever 
a global is redefined. 
be easier to "get right", as a small number of features will reduce the chances of
unforeseen bad interactions. A naive implementation will also be quick to build. 

On the other hand, as the core language will not be usable in practice without
several layers of extensions, the problem of tracking source locations becomes
relevant: a user will normally write in the extended language and expect the
programming system to refer to the extended program as she wrote it, in terms of
file names, line numbers and syntactic forms: for example error messages referring
to the final transformed program would prove very hard to understand¹¹. Just a
little more subtly, static analyses will often need to refer to the non-primitive
forms of some higher-level intermediate languages, instead of the core¹²: hiding
details is the whole point of abstractions. The obvious implementation will also be
inefficient, as
evident from the examples in §1.3.2 and §1.3.3; however, since the code will start 
small and manageable, it will be less hard to introduce optimizations where needed. 

We have mentioned pros and cons; in fact we argue that all the objections above 
can be answered, but even this is not essential in our view: we believe that any or
all of the problems above would still be offset by the crucial advantage of obtaining 
an open-ended language, able to grow in unanticipated directions. 

In order to achieve this ultimate goal we need both a small core language, and 
strong syntactic abstractions. 

The choice above is a conscious restriction on the region of the design space we
are setting out to explore. Other choices are certainly possible, and a couple have been
tackled in the past, with interesting results. 

1.3.5 Related languages 

The most famous example of this design as well as a source of inspiration for this 
work, again, is Scheme [79]; however, we have to remark that its core contains much
more than a functional language¹³, including complex features such as first-class
continuations, which could not be re-implemented by using Scheme's syntactic ab- 
straction alone. Despite its beauty Scheme's syntactic abstraction system itself is 

¹¹Actually, simple partial solutions to this problem have been known for a long time. For
example the C language preprocessor generates a #line directive in the output mentioning the 
original file name and line whenever the source of the output changes; this is enough information 
for the compiler to map each element of the single stream of code it receives back to its original 
location. Anyway debugging C code using macros remains notoriously hard: one reason is the lack 
of a similar output-to-input mapping at the level of syntactic forms. 

¹²Again, an example is let under Hindley-Milner type inference, which could be defined with
a macro such as the one above, but would benefit from being typed with let-polymorphism, dif- 
ferently from generic function calls. As another example, a CPS transform can encode first-class 
continuations into anonymous functions — but in a static type system, continuations would need 
their own separate typing rules. 

¹³Tom Lord's failed proposal for the new Scheme Standard (before his alleged expulsion
from Working Group 1) would have been closer to our vision. Lord's core language "WGO
Scheme" would have used fexprs [64, 78] and reified environments. His own account is at
http://lambda-the-ultimate.org/node/3861#comment-57967.
also quite complex, based as it is on hygienic macros [42]. The macro system does
not look easy to factorize, as shown by the experience of psyntax: psyntax is a 
compiler translating Scheme with macros into pure Scheme, which is itself written 
in Scheme with macros and hence needs to be bootstrapped with a pre-compiled 
version. Psyntax is elegant but, by the author's admission [25, §18], not trivial. 

As our second example of a reductionistic design, Forth [30, 29] is about as
obvious: its basic mechanism is imperative state mutation involving a fixed set of
global stacks, with no need for actual expressions: "42" is an imperative statement 
pushing a number on a stack, and the "+" word replaces the two topmost stack ele- 
ments with their sum. All words, predefined or not, are zero-parameter zero-result 
procedures with imperative stack effects. Even control features such as loops are 
defined as stack operations, involving the stack normally used for procedure return 
addresses. 

Defining a word involves temporarily switching to a "compile state", in which 
each encountered word is taken from the program input stream and appended to 
the current definition rather than being executed immediately; to implement this, 
all word definitions begin and end with the state-switching words ":" and ";". The
words "(" and ")", respectively opening and closing a comment, work in the same 
way. No syntactic structure exists at a scale larger than individual words. 
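
The interpret/compile state switch can be sketched in a few lines of Python. This is a toy under assumptions of our own, not a faithful Forth: the function name `run_forth` is ours, only the words "+", "*" and "dup" are predefined, ":" and ";" are hard-wired rather than being ordinary immediate words, and compiled words are looked up by name when executed instead of being resolved at definition time.

```python
# A toy interpreter showing Forth's interpret state and compile state.

def run_forth(source):
    stack = []
    dictionary = {
        "+": lambda s: s.append(s.pop() + s.pop()),
        "*": lambda s: s.append(s.pop() * s.pop()),
        "dup": lambda s: s.append(s[-1]),
    }

    def execute(words):
        for word in words:
            if word in dictionary:
                dictionary[word](stack)
            else:
                stack.append(int(word))     # anything else must be a number

    definition = None       # None while interpreting, else (name, body)
    tokens = iter(source.split())
    for word in tokens:
        if definition is None:
            if word == ":":
                definition = (next(tokens), [])    # enter compile state
            else:
                execute([word])
        elif word == ";":
            name, body = definition
            dictionary[name] = lambda s, body=body: execute(body)
            definition = None                      # back to interpret state
        else:
            definition[1].append(word)    # appended rather than executed
    return stack

if __name__ == "__main__":
    print(run_forth(": square dup * ; 7 square"))
```

Note that the interpreter never builds any structure larger than a flat list of words, in line with the remark above.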

Forth is an unusual language somewhat defying classification, and it is debatable 
whether abstraction features such as changing "state" even count as syntactic — it 
might be the case that very strong procedural abstraction may partially compensate 
for the absence of syntactic abstraction features, like higher order does in functional 
programming¹⁴. Syntactic or not, Forth abstraction features have proved to be
effective, and they do allow abstracting to a high level, despite often starting from
the bare metal with no operating system. Due to its simplicity, like Lisp, Forth has 
been independently re-implemented many times, by building some core primitive 
words in assembly and then writing the rest of the system in itself. The language is 
so small that hardware implementations exist [43, 34], and its key proponents such 
as Chuck Moore exhibit a cultural tendency to reject all conventional software as 
bloated and hopelessly overcomplicated [60]. 

Finally some classic object-oriented systems such as Smalltalk are fairly minimal- 
istic as well, and provide relatively strong procedural abstraction. Their example 
suggests reflection as a further strategy to help build complex programs: in object- 
oriented systems the state of each computational entity in the running program 
is available to the program itself, which can query objects for their interfaces at 

¹⁴Our favorite example is with-mutex (for example as in [21, §Mutexes and Condition Variables])
in a shared-state concurrent language with exceptions: with-mutex executes some given statements
in a critical section so that a mutex is acquired at entry and released at exit, including the case in
which an exception causes a jump out of the critical section. It is easy to define with-mutex as a
macro, and where macros are not available it can still be simulated using a higher-order procedure, 
at some loss of elegance. Without either of these features, the user is forced to duplicate code. 
run time. Such runtime type information also provides the foundation of dynamic
method dispatching, but we find this style of late binding to be less interesting for 
our purposes, serving better as a practical modelling tool than as a foundation for 
a core language. 

1.4 Our solution 

What kind of language do we want? In such a vast design space there is no clear 
answer, and committing to a decision appears dangerous. Even with no breakthrough
in sight at this particular time, our topology of §1.1 might even get enriched by 
entirely new dimensions in the future. 

Of course we do have opinions about what kind of language would be better 
to solve the software crisis, and we will not even try too hard to hide our personal 
preferences as the description unfolds; but opinions are not science. Lacking a silver 
bullet, the best course of action seems to leave our design open-ended and follow 
in software the lesson of RISC, eschewing any particular focus or specialization in 
exchange for wider applicability. 

In order to achieve Steele's vision in [85], our language: 

• will be built on a very small core, like Forth, but in a form more amenable to 
formal reasoning; 

• will provide strong syntactic abstraction features, like Scheme does, plus trans- 
forms, in the interest of expressivity; 

• will provide reflection, like object-oriented systems; 

• will not depend on either static or dynamic type checks in the core; such 
systems can be added as extensions; 

• is meant to be practical, and efficiently implementable.

We call our language "ε", following the convention of naming small variables in
Mathematical Analysis. When written in the Latin alphabet as "epsilon", the initial
"e" in its name should always be lowercase.

A personality is a language made of the ε core plus extensions, in analogy with
research operating systems implementing several different APIs on top of the same 
microkernel [96]. Personalities may reach very far from the core, as our transform 
examples above show. 

We anticipate the development of complex¹⁵ and widely diverging personalities,
viewing the emergence of incompatible dialects, so feared in some communities, 
rather as a sign of health. 

¹⁵The composition of extensions is a difficult problem, for which no general solution is apparent:
see §5.5.

It is unfortunate but very possible that in a setting where a strongly extensible
language is adopted, a new separation of programmers and meta-programmers (as 
personality developers) will follow the existing divide between programmers and 
language implementors; we do not claim to be able to cover such a cultural gap, but 
we mean to substantially ease the work of the second group. 

The Programming Language discipline needs more experimentation and proto- 
typing. Let people play, and the language will grow. 

1.5 Summary 

Programming languages have traditionally been diverse in paradigm, typing policy 
and concurrency model. The current trend of hybridization makes languages more 
expressive, but also much harder to reason about; moreover most current languages 
remain difficult to extend and lacking in syntactic abstraction features. 

In keeping with the philosophy of Scheme but bringing it much further, we 
propose the new programming language ε as an example of an alternative style
of language definition in which strong abstraction capabilities allow the end user 
to express the needed linguistic features as translations into the extremely simple 
core language which is easy to reason about. A personality is a library of language 
extensions, in fact defining a new language in ε itself.

We argue in favor of language experimentation, recognizing dialect proliferation 
as beneficial in the long term. 



Chapter 2 

The core language ε₀



In this chapter we are going to formally describe the core language ε₀ by giving
a small-step operational semantics for it and stating under which conditions an 
implementation is bound to behave according to the semantics. 

As the foundation of much of the rest of this work, the specification will be used 
in §3 for describing the language reflective features, in §4 to prove correct a static 
analysis, and in §5 for defining syntactic extension semantics. 

2.1 Features and rationale

Our core language ε₀ must be easy to reason about and efficiently compilable, has
to include reflective features providing access to the program itself, and allow for 
parallelism. On the other hand the language does not need to be especially friendly 
to human users, since programmers will normally access it by means of higher-level 
syntactic extensions. 

Satisfying such a set of requirements yields an idiosyncratic language whose ex- 
treme simplicity risks being overlooked at a first glance, obscured by some slightly 
unusual design choices. 

Before formally specifying ε₀'s syntax and semantics, it is worth clarifying the
rationale of some design decisions.

2.1.1 First order 

ε₀ is a first-order call-by-value expression-based imperative language with mutually-
recursive procedures accepting zero or more parameters and returning zero or more 
results, where procedures are globally defined in a flat namespace. 

The language is first-order: no anonymous procedures exist, and procedures cannot
be passed as parameters or returned as results¹.

Variable references are trivial to resolve: a block form is provided for binding 
local variables, which take precedence over procedure parameters, which in their turn 
take precedence over global variables. Since variables bound in other procedures are 
never accessible no other scoping rule is necessary. 

Expressions return values and are allowed to have side effects; no looping form
exists, and since recursion is only permitted at the top level among global
procedures no explicit fixpoint operator is needed. This sets ε₀ apart from most
functional languages, as the language of ε₀ expressions not referring to global
procedures is not Turing-complete.

The language exhibits relatively low-level features, making it easy to write a
simple compiler with a clear efficiency model for non-self-modifying programs;
Control Flow Analysis is also trivial, since all callees are explicitly identified
by name at call sites. No escape mechanism such as exceptions, longjmp or
first-class continuations is provided at this level, so evaluation strictly follows
an intuitive stack discipline; in fact ε₀ is stack-implementable, and after
macroexpansion and transforms have run, the residual ε₀ program does not
necessarily require garbage collection.

2.1.2 Reflection 

The current set of procedures is part of the global state of ε₀, and procedure
definitions are accessible to the program for both reading and writing; this allows a
program to analyze and modify itself. 

Compilation in ε₀ consists in examining the current global state in terms of data
and procedures, and producing as output a low-level program which, when executed, 
will reproduce the current state, in a style reminiscent of some Smalltalk systems 
and the Emacs unexec hack [47, §Building Emacs]. 

An ε₀ compiler can be an ordinary set of ε₀ definitions, running on top of the
interpreter; in this sense we can say that the compiler, if any, is part of the program 
being compiled, rather than an external tool; and in the same way the user is free 
to build other meta-level tools such as code analyzers, transformers or optimizers. 

2.1.3 Handles 

Since a program has to reason about itself, ε₀ needs some mechanism for
unambiguously referring to program points, also distinguishing different occurrences of
otherwise identical syntactic forms. For this reason each syntactic form contains a 

¹We are going to relax this restriction in the implementation for efficiency's sake; anyway,
as shown in §5.4.1.2, it will be trivial to automatically transform any program using "procedure 
pointers" into an equivalent first-order program. Closures will be remarkably more involved to 
define (see §5.4.4.4). 



unique identifier which we call handle, the only requirement being that each expres- 
sion of a program have a different handle. 

At the implementation level it is reasonable to think of handles as unboxed integers 
or pointers to unique objects, but the specific nature of handles as a data type is 
immaterial: in practice the only relevant feature of a handle is its identity. 

Handles are contained in expressions at all nesting levels, so that subexpressions
at any depth may be referred to by global names.

It is easy to associate information with handles, typically using global tables.

2.1.4 Primitives 

The language specification should be complemented with a set of "predefined"
primitive operators and data types for such operators to act upon, integer
arithmetic being the obvious example; other useful primitives include memory
allocation and side effects, and input/output. Primitives may accept parameters,
return results and affect the global state but, the rationale being analyzability,
they may not alter program control; this prevents "jumping" operations of the kind
of exceptions, longjmp [37] and call/cc [79] from being implemented as primitives.

In the following we will not assume any particular set of primitives, limiting 
ourselves to some reasonable constraints which the particular primitives have to 
satisfy. 

We do not dwell further on the specification of primitives which are to be imple- 
mented at low level, in practice using C or assembly language; a formal semantics 
of such low-level definitions is outside the scope of the present work. When
working with any particular ε₀ or ε program, we will always take the set of available
primitives as fixed.

2.1.5 Bundles 

We allow ε₀ procedures and expressions to return any number of results, including
zero, such a decision being more a concession to efficiency [8, 16] than an attempt
at restoring symmetry between input and output.

A bundle is an ordered sequence of values which may be the result of a computation.
The only feature distinguishing a bundle from an ordinary n-tuple or list is the
fact that bundles are not treated as data structures and in particular are expressible
but not denotable: an ε₀ variable can only refer to one object, even if an expression
is allowed to return a bundle of any size; the rationale being, of course, that no 
bundle data structures need to be expensively allocated and destroyed at runtime: 
each separate bundle element will be simply assigned a stack cell or register, possibly 
not even consecutively: no single "value" represents the whole bundle. 

In this sense bundles bear resemblance to the Common Lisp and Scheme
multiple values feature [4, 79], with the important difference that in ε₀ callers do
not ignore all results except the first one by default.

In order to work on bundle components, ε₀'s block form binds up to as many
variables as the dimension of the bundle that its bound expression evaluates to;
hence ε₀ blocks also serve the purpose of "destructuring" bundles, which in practice
simply means locally naming their components. 

For example if a "quotient-remainder" primitive returned both the quotient 
and the remainder of two naturals (and indeed many hardware architectures provide 
such a machine instruction) a block could compute the quotient and remainder of 
some parameters, name the results respectively x and y, and evaluate a body in an 
environment where such variables are visible. It is worth stressing that at runtime
this naming does not entail any moving, copying or - worse - memory allocation. 

It is useful in practice not to always name all the components of a bundle, in 
particular for using nested blocks to simulate a statement sequence where the results 
of the intermediate steps are irrelevant; more in general, often one wants to ignore 
the result of a subexpression. 
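
The binding behavior described in the last three paragraphs can be modelled with a short Python sketch. It is an analogy under stated assumptions: a Python tuple stands in for a bundle (whereas real bundles are never allocated as data structures), environments are dictionaries, and the names `quotient_remainder` and `let_block` are hypothetical.

```python
# Modelling a block which names the leading components of a bundle.
# A tuple stands in for the bundle purely for illustration.

def quotient_remainder(a, b):
    """Return a 2-component "bundle": the quotient and the remainder."""
    return a // b, a % b

def let_block(variables, bundle, body, environment=None):
    """Bind `variables` to the first components of `bundle`, then run `body`.

    Fewer variables than components is allowed (the rest are ignored);
    more variables than components is an error.
    """
    if len(variables) > len(bundle):
        raise ValueError("more variables than bundle components")
    local = dict(environment or {})
    local.update(zip(variables, bundle))    # zip stops at the shorter one
    return body(local)

if __name__ == "__main__":
    print(let_block(["x", "y"], quotient_remainder(17, 5),
                    lambda env: (env["x"], env["y"])))
    # Binding no variables at all gives plain sequencing.
    print(let_block([], quotient_remainder(17, 5), lambda env: "next step"))
```

Binding zero variables is the case used to sequence statements whose intermediate results are irrelevant.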

Bundles do somewhat complicate expression composition and are a possible cause
of errors, but the performance gain they offer seems hard to obtain automatically 
by compiler optimization only. Recursive procedures returning more than one result 
seem a quite compelling example. 

Of course personality implementors aiming for very simple extensions are always 
free not to use bundles, or for that matter any other ε₀ feature.

2.1.6 Parallel features 

The parallel features of ε₀ appear mundane compared to some of the points above,
limited as they are to creating futures associated to asynchronous threads, and 
extracting the result of a given future when waiting for its computation to terminate. 

The system lends itself to both shared memory and message-passing, depending 
on primitives. In complex personalities aiming at high efficiency on large parallel 
machines or clusters, it is reasonable to expect that both styles will be used at dif- 
ferent levels. 
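
As a rough analogy for futures in a shared-memory setting, the following Python sketch builds the two operations on top of the standard `concurrent.futures` module. The helper names `fork` and `join`, the pool size and the example workload are our own choices for illustration, not a description of the implementation.

```python
# Futures as the only parallel feature: create one, then wait for its result.
from concurrent.futures import ThreadPoolExecutor

# A small shared pool standing in for asynchronous threads (size arbitrary).
_pool = ThreadPoolExecutor(max_workers=4)

def fork(procedure, *arguments):
    """Start procedure(*arguments) asynchronously and return its future."""
    return _pool.submit(procedure, *arguments)

def join(future):
    """Wait for the computation behind `future` and return its result."""
    return future.result()

def sum_of_squares(low, high):
    return sum(n * n for n in range(low, high))

def parallel_sum_of_squares(limit):
    middle = limit // 2
    left = fork(sum_of_squares, 0, middle)
    right = fork(sum_of_squares, middle, limit)
    return join(left) + join(right)

if __name__ == "__main__":
    print(parallel_sum_of_squares(1000))
```

In this sketch `fork` receives a named procedure and already-evaluated arguments, so the asynchronous computation never needs access to the local variables of the thread which created it.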

Again, parallel features introduce some complications into ε₀ but are too
"fundamental" to be left out and then meaningfully reintroduced as language extensions.

2.2 Syntax 

We are now ready to formally specify the syntax of ε₀ expressions, and establish
some terminology about subexpressions. 

Let the set of variables $\mathbb{X}$, the set of procedure names $\mathbb{F}$, the set of
primitive names $\Pi$, the set of handles $\mathbb{H}$ and the set of thread identifiers
$\mathbb{T}$ be any numerable sets. By convention we will use the following metavariables,
possibly with decorations, to represent objects of the respective sets: $x \in \mathbb{X}$
for variables, $f \in \mathbb{F}$ for procedure names, $\pi \in \Pi$ for primitive names,
$h \in \mathbb{H}$ for handles, and $t \in \mathbb{T}$ for thread identifiers.

Since the actual nature of "values" is irrelevant for the purposes of this chapter
but has some other deep ramifications, we postpone its discussion until §3.3.1; as
of now we simply speak of a set of values $\mathbb{C}$, using the metavariable
$c \in \mathbb{C}$ for representing its elements.

For our examples in this chapter it will suffice to just use natural numbers,
writing "$\mathcal{N}(n)$" to represent $n \in \mathbb{N}$, booleans $b \in \{\#t, \#f\}$
written as "$\mathcal{B}(b)$", pointers or memory addresses $a$ written as
"$\mathcal{A}(a)$" and thread identifiers or futures $t$ written as "$\mathcal{T}(t)$".

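Definition 2.1 (ε₀ syntax) We define an ε₀ expression according to the following
grammar:

$$
\begin{array}{rcl}
e & ::= & x_h \\
  & |   & c_h \\
  & |   & [\mathtt{let}\ x^*\ \mathtt{be}\ e\ \mathtt{in}\ e]_h \\
  & |   & [\mathtt{call}\ f\ e^*]_h \\
  & |   & [\mathtt{primitive}\ \pi\ e^*]_h \\
  & |   & [\mathtt{if}\ e \in \{c^*\}\ \mathtt{then}\ e\ \mathtt{else}\ e]_h \\
  & |   & [\mathtt{fork}\ f\ e^*]_h \\
  & |   & [\mathtt{join}\ e]_h \\
  & |   & [\mathtt{bundle}\ e^*]_h
\end{array}
$$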

We call each separate production right-hand side an expression form or form. 

We call the first two cases a variable and a literal constant, respectively. A let 
block contains zero or more distinct bound variables, a bound expression, and a 
body. A procedure call call expression mentions a procedure name and zero or 
more actual parameters. Very similarly, a primitive call mentions a primitive name, 
and zero or more actual parameters. The conditional form if comprises a discrim- 
inand expression, zero or more conditional cases, and finally the then branch and 
else branch expressions. A fork expression has the same syntax as a procedure call, 
while a join expression simply contains one future expression. A bundle expression 
contains zero or more bundle items. 

Each expression and its subexpressions, at all levels, contain unique handles. 

We define $\mathbb{E}$ to be the set of all ε₀ expressions; we use the metavariable $e$,
possibly with decorations, to represent its elements. □
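
One possible in-memory encoding of this grammar is sketched below in Python, with one record type per expression form and a handle field in each. This is an assumption-laden illustration rather than the representation used by the implementation: the class and field names are ours, handles are modelled as integers, and `handles_are_unique` merely checks the uniqueness requirement stated in the definition.

```python
# One record type per expression form; every form carries a handle.
from dataclasses import dataclass, fields
from typing import Tuple

@dataclass(frozen=True)
class Variable:
    name: str
    handle: int

@dataclass(frozen=True)
class Constant:
    value: object
    handle: int

@dataclass(frozen=True)
class Let:
    variables: Tuple[str, ...]
    bound: object
    body: object
    handle: int

@dataclass(frozen=True)
class Call:
    procedure: str
    actuals: tuple
    handle: int

@dataclass(frozen=True)
class Primitive:
    primitive: str
    actuals: tuple
    handle: int

@dataclass(frozen=True)
class If:
    discriminand: object
    cases: tuple          # constants, not expressions
    then_branch: object
    else_branch: object
    handle: int

@dataclass(frozen=True)
class Fork:
    procedure: str
    actuals: tuple
    handle: int

@dataclass(frozen=True)
class Join:
    future: object
    handle: int

@dataclass(frozen=True)
class Bundle:
    items: tuple
    handle: int

EXPRESSION_TYPES = (Variable, Constant, Let, Call, Primitive,
                    If, Fork, Join, Bundle)

def subexpressions(expression):
    """Yield the expression itself and all its subexpressions, at any depth."""
    yield expression
    for field in fields(expression):
        value = getattr(expression, field.name)
        children = value if isinstance(value, tuple) else (value,)
        for child in children:
            if isinstance(child, EXPRESSION_TYPES):
                yield from subexpressions(child)

def handles_are_unique(expression):
    handles = [e.handle for e in subexpressions(expression)]
    return len(handles) == len(set(handles))
```

Because only the identity of a handle matters, any other type with a reliable equality test would serve equally well in place of integers.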

The grammar in Definition 2.1 should be hardly surprising at this point, except 
possibly for the shape of the conditional and fork expressions. 

The conditional expression shape is actually another small concession to efficiency: 
operationally, the discriminand expression is evaluated and compared to all the 
given conditional cases: if the discriminand evaluates to one of the given constants 
then the conditional reduces to the then branch, otherwise it reduces to the else 
branch. In many cases this kind of expression, when nested, is easy to optimize into 
multi-way conditional branches implemented as jump tables or balanced comparison 
trees. 

Of course an ε₀ if expression can also simulate an ordinary two-way McCarthy
conditional, by using a boolean literal as its only conditional case. Writing the
false literal as #f in the style of Scheme, we can simulate the two-way conditional
"$(e \to e', e'')$" by
$[\mathtt{if}\ e \in \{\mathcal{B}(\#f)\}\ \mathtt{then}\ e''\ \mathtt{else}\ e']_{h_0}$,
for some handle $h_0$. Of course reasonable personalities will define their own
friendlier conditionals.
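
The operational reading of the conditional, and the encoding of the two-way case, can be sketched in Python as follows. The sketch makes several assumptions of its own: expressions are modelled as zero-argument functions so that only the selected branch is evaluated, Python's `False` stands for the false literal, and `evaluate_if` and `two_way` are hypothetical names.

```python
# The multi-case conditional, and the two-way conditional built on top of it.

def evaluate_if(discriminand, cases, then_branch, else_branch):
    """Evaluate the discriminand, then exactly one of the two branches.

    `cases` is a tuple of constants; the three other arguments are thunks.
    Types are compared too, so that the number 0 does not match False.
    """
    value = discriminand()
    matches = any(type(value) is type(case) and value == case
                  for case in cases)
    return then_branch() if matches else else_branch()

def two_way(condition, if_true, if_false):
    # The only conditional case is the false literal, so the "then" branch
    # of the underlying form is the one taken when the condition is false.
    return evaluate_if(condition, (False,), if_false, if_true)

if __name__ == "__main__":
    print(two_way(lambda: 3 < 4, lambda: "smaller", lambda: "not smaller"))
    print(evaluate_if(lambda: 6, (0, 2, 4, 6, 8),
                      lambda: "even digit", lambda: "something else"))
```

Note the swapped branches in `two_way`, matching the order of the two branch expressions in the encoding above.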

As for the fork expression, at first glance the form given above might appear
gratuitously complicated compared to an alternative containing only one "asynchronous
expression". However, such an alternative would be difficult to evaluate, as the
asynchronous expression could then refer to variables bound in the original thread,
which would effectively become nonlocals. For this reason, as elsewhere in ε₀, we chose
a more constrained syntax without too much fear of inconveniencing the user:
personalities will provide higher-level fork operators.

Facilities for defining procedures will be dealt with in §3.

2.2.1 Meta-syntactic conventions for expressions

Since every syntactic object contains a handle, independently of the syntactic
category or the specific case, when identifying particular components of a syntactic form
instance we may explicitly specify a (meta-)handle in a sub-expression of a given
expression, despite referring to the sub-expression itself just with a meta-variable;
for example $h_2$ represents the handle of the join future expression in
$[\mathtt{join}\ e_{h_2}]_{h_1}$, regardless of the future expression's specific "shape".
We will also omit subscripts in meta-variables which already contain (meta-)handles
unambiguously identifying instances: for example we will simply write
$[\mathtt{primitive}\ {+}\ e_{h_1}\ e_{h_2}]_{h_0}$ instead of the heavier
$[\mathtt{primitive}\ {+}\ {e_1}_{h_1}\ {e_2}_{h_2}]_{h_0}$. When only one identifier
appears as a subscript, it should always be interpreted as a handle rather than a
metavariable decoration.

We will usually name (meta-)handle indices in a top-to-bottom left-to-right order 
according to the expression syntax; we may let indices start from either 0 or 1,
according to which option provides more notational convenience, such as avoiding
the occasional "+ 1" in subscripts. For example we prefer writing an n-element
bundle expression as $[\mathtt{bundle}\ e_{h_1} \ldots e_{h_n}]_{h_0}$ rather than as
$[\mathtt{bundle}\ e_{h_2} \ldots e_{h_{n+1}}]_{h_1}$.
Since we use handles only to identify occurrences of syntactic forms, their actual 
value is always immaterial: starting indices in meta-handle sequences can be just as 
arbitrary. 

2.3 Semantics and the real world 

Before finally specifying ε₀'s semantics it is worth adding one last remark to prevent
some misunderstandings, explicitly delimiting the cases in which an implementation 
is compelled to respect the semantics. This point is crucial if we are to speak 
about actual programs running in actual machines, rather than just another formal 
calculus whose terms unfold into other terms in a Platonic universe where memory
is infinite and checking for error conditions is free.

As ε₀ is the underlying common language all personalities ultimately reduce to,
it also represents an efficiency upper bound: a program written in a higher-level
personality will only run as fast as its translation into ε₀; hence the critical need
for speed, also offsetting the cost of making some implementations unfriendly and 
unforgiving of mistakes. But thankfully an unforgiving implementation does not 
need to be the only one, and when developing an application a user will benefit 
from the feedback of a slower interpreter failing in a more descriptive way than a 
"Segmentation fault" message, and possibly allowing some form of debugging. 

The nature itself of failure needs to be carefully stated here: what we refer to as 
"failure" (§2.5.3) or "resource overflow" (§2.3.1) in the semantics does not necessarily 
translate into a dynamic check at the implementation level: 

Implementation Note 2.2 (implementation guarantees) A conforming imple- 
mentation will behave according to the semantics provided that the execution never 
reaches an error configuration and never exceeds any resource limit; otherwise, the 
implementation behavior is unspecified. □ 

Not failing and not overflowing resources is just a sufficient condition for an
implementation to respect the semantics; in the implementation a program violating
one of these conditions is allowed to crash or silently return any result, possibly
even the correct one: no guarantees at all. If a personality implementor wants to
specify some behavior in one of these cases then it is her responsibility to perform
static checks on the input code or to include dynamic checks in the generated ε₀
code, in order to prevent the conditions for unspecified behavior from occurring.

It may be worth stating explicitly that, as a trivial consequence of Implementation
Note 2.2, an execution consuming an unbounded quantity of any resource
has unspecified implementation behavior. 

2.3.1 Resource limits 

As it is easy to imagine, our first mention of numerable sets in §2.2 already hid a 
caveat: only a finite number of distinct values will be representable in an implemen- 
tation, due to the finite nature of address spaces and word sizes. 

There exist other remarkable cases of resource limits: for example the amount of 
available virtual memory, often dramatically smaller than what the address space al- 
lows for; operating systems also usually constrain the number of concurrent threads; 
an implementation might limit the stack size to a constant value. All of these cases 
will be covered by Implementation Note 2.7. 

In the following we will state which resources may be limited by implementations
in explicit Implementation Notes such as the following one; as already explained in
Implementation Note 2.2, an implementation is not forced to detect the failure at
runtime, and may proceed with unspecified behavior in case of resource overflow.

Implementation Note 2.3 (Syntactic resource limits) Each instance of the
following items occupies some memory in an implementation; an implementation will
impose a limit on the sum total of all the used memory at any given time, possibly
multiplied by a logarithmic factor.

• variable and procedure names, the cost being proportional to the name length; 

• the number of handles; 

• expression syntactic complexity. □ 

In Implementation Notes dealing with resource limits such as Implementation Note 2.3
above, we deliberately ignore constant terms: for example if the physical resource
occupation of $n$ items of a certain kind is $n \cdot a + k$ units, an implementation may
simply declare each item to take $a$ units and the total resource availability to be $k$
units lower than its actual dimension.
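
As a worked instance of the convention above, writing $R$ for the actual total
availability of the resource ($R$ is a name introduced here only for illustration),
the fitting condition can be rewritten without the constant term:

$$n \cdot a + k \le R \quad\Longleftrightarrow\quad n \cdot a \le R - k$$

so declaring a cost of $a$ units per item against a budget of $R - k$ units accepts
exactly the same values of $n$.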

Moreover, since avoiding resource overflows is only a sufficient condition for an im- 
plementation to respect the stated semantics, we allow Implementation Notes to 
describe resource occupation as an upper bound. 

At the cost of sounding pedantic, we stress that the statement above does not
limit our reasoning about resources to asymptotic approximations; in fact, where
a given implementation instantiates the precise costs and resource availability, it is
possible to reason about whether a program "fits" the implementation on which it
runs, the rationale of course being that a program overflowing resources is not
any better than an incorrect one.

2.4 Configurations 

We are now ready to formally define the mathematical structures used in ε₀'s
dynamic semantics.

2.4.1 The global state 

A global state or simply state, always represented with a possibly decorated $\Gamma$
metavariable, represents the instantaneous condition of an execution; an execution
may access a state, both reading and destructively mutating parts of it. We call
"$\boldsymbol{\Gamma}$" the set of all possible states.

A state is an inherently composite object made of several state environments; by
"environment" we simply mean a mathematical function mapping keys into values.
We do not list all state environments here, since the need for some of them will not be
apparent until later. In order to lighten our notation and to allow for yet-unspecified
components without depending on some arbitrary order, we also avoid traditional
projection and update operators, opting instead for a notation referring to components
by name, as if the state were a single "record" of environments.

2.4.1.1 Notational conventions for states and environments

It is often convenient to exploit the set-of-pairs nature of relations to represent
environments in an extensional style, as in "$\{x_1 \mapsto q_1, \ldots, x_n \mapsto q_n\}$"; an
interesting particular case is the empty environment, which is to say a
nowhere-defined function, which we write as the empty set "$\emptyset$".

If $\vartheta$ is the state environment named $n$ in $\Gamma$ then we may write $\Gamma_n$ to mean $\vartheta$; and
of course we also employ the ordinary notation for function application by writing,
for example, $\Gamma_n(x) = q$ or equivalently $\Gamma_n : x \mapsto q$.

As per the standard update notation, we write "$\vartheta[x \mapsto q]$" to represent an
environment equal to $\vartheta$ everywhere on its domain except on $x$, which the updated
environment maps to $q$.

Extending the standard notation, we will also deal with updated environments
in the state: in other words we build a state identical to a given one save for
one environment, which has been updated in its turn; we will write "$\Gamma[^{n}_{x \mapsto q}]$" to
represent the updated state identical to $\Gamma$ except for the environment named $n$,
which will be $\Gamma_n[x \mapsto q]$ instead of $\Gamma_n$.

We also write $\Gamma[^{n}_{\vartheta}]$ to represent a state identical to the state $\Gamma$ except for the
state environment named $n$, entirely replaced by the environment $\vartheta$.

Our use of brackets for updated states is distinct from the usual environment update
notation, which we also adopt: we write $\eta[\vartheta]$ to mean an environment identical to
$\eta$ everywhere except on the domain of $\vartheta$, where instead it is identical to $\vartheta$.

Notice that, unless we are dealing with meta-labels such as "n" here in §2.4.1.1,
we always write state environment labels in typewriter font: this makes it clear that
we are establishing a label for some state environment at its first mention, without
the need to detail every time how one label represents the similarly-named state
environment, when the association is always obvious from the context anyway.
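
The notational conventions of this section can be mirrored with ordinary
dictionaries. The following Python fragment is only a sketch under stated
assumptions: environments are modeled as dicts, a state as a dict of named
environments, and every function name is hypothetical, chosen here for
illustration.

```python
# Environments as dicts; a state as a "record" (dict) of named environments.
# All names below are illustrative assumptions, not part of the formal text.

def update(env, x, q):
    """env[x -> q]: equal to env everywhere except on x, which maps to q."""
    new_env = dict(env)
    new_env[x] = q
    return new_env

def override(eta, theta):
    """eta[theta]: equal to eta except on the domain of theta, where it equals theta."""
    return {**eta, **theta}

def update_state(state, n, x, q):
    """State identical to `state` except that environment n becomes state[n][x -> q]."""
    return {**state, n: update(state[n], x, q)}

def replace_environment(state, n, theta):
    """State identical to `state` except that environment n is replaced by theta."""
    return {**state, n: theta}

empty = {}  # the nowhere-defined environment
state = {"global-environment": {"x": 42}, "memory": empty}
state2 = update_state(state, "global-environment", "y", 7)
```

None of the operations mutates its arguments, which matches the mathematical
reading: an "updated" state is a different state, and the original one is
still available.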



2.4.2 Global and local environments 

The global environment is a state environment mapping global variable names into
values, and can be thought of as a partial function $\mathbb{X} \rightharpoonup \mathbb{C}$.

The global environment keeps track of globally-visible objects (globals or
non-procedures), which are always accessible by a variable name unless shadowed by
a procedure parameter or a local variable, which instead are bound in local
environments: local environments, also $\mathbb{X} \rightharpoonup \mathbb{C}$ functions, take precedence over the
global environment, and of course they are not state environments; we use the $\rho$
metavariable for local environments, possibly with decorations.

For example, when evaluated in a state $\Gamma[^{\mathtt{global\text{-}environment}}_{x \mapsto \mathcal{N}(42)}]$ and the local
environment $\emptyset$, the variable $x$ in the expression $x_h$ will refer to the value $\mathcal{N}(42)$; but if
instead the local environment was $\{x \mapsto \mathcal{N}(10)\}$, then $\mathcal{N}(10)$ would take precedence
over the global value in $x$.
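
The precedence of local over global bindings can be illustrated with a small
lookup function. This is a hedged sketch: the dict-based representation and the
name `lookup` are assumptions made here, and naturals are represented by plain
Python integers.

```python
def lookup(x, local_env, state):
    """Resolve x with the local environment taking precedence over the global one."""
    if x in local_env:
        return local_env[x]
    global_env = state["global-environment"]
    if x in global_env:
        return global_env[x]
    # The semantics treats an unbound variable as a failure.
    raise KeyError(f"unbound variable: {x}")

state = {"global-environment": {"x": 42}}
print(lookup("x", {}, state))         # the global binding is visible
print(lookup("x", {"x": 10}, state))  # the local binding shadows the global one
```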

2.4.3 Memory 

ε₀ expressions are allowed to perform imperative operations on mutable data
structures: in particular expressions may read or update cells of memory buffers, which
can be allocated and destroyed.

Such operations rely on the memory state environment $\mathbb{A} \rightharpoonup \mathbb{C}^*$, mapping
addresses into mutable word sequences or buffers; we might occasionally refer to each
buffer element as a memory cell.

It is important to notice that the memory state environment models the heap in 
the implementation, of which each cell makes up one word: here we are dealing 
with cells which can be allocated and destroyed with any strategy, rather than a 
simple LIFO policy. 

No data structures such as conses, tuples and arrays are hardwired in ε₀, but
memory makes it easy to define such objects in a personality. The fact that
dynamically-created structures are "made of" memory entails their mutability in
a natural way. Immutability, if one chooses to enforce it for some class of data in a
high-level personality, can be realized with dynamic or static checks which prevent
updating^; but in ε₀ all memory cells are freely mutable, so as not to restrict the
user in any way.
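
To make the idea of building data structures out of buffers concrete, here is a
minimal Python sketch. It assumes a memory modeled as a dict from addresses to
lists of words; the operation names (`allocate`, `load`, `store`, `destroy`,
`cons`) and the address supply are hypothetical and do not correspond to the
actual primitives, which are introduced informally in §5.

```python
import itertools

_fresh_addresses = itertools.count(1)  # hypothetical supply of unused addresses

def allocate(memory, size, initial=0):
    """Return a fresh address and a memory extended with a new buffer."""
    address = next(_fresh_addresses)
    return address, {**memory, address: [initial] * size}

def load(memory, address, index):
    """Read one cell of a buffer."""
    return memory[address][index]

def store(memory, address, index, word):
    """Return a memory in which one cell of a buffer has been overwritten."""
    buffer = list(memory[address])
    buffer[index] = word
    return {**memory, address: buffer}

def destroy(memory, address):
    """Return a memory without the given buffer; any buffer may be destroyed at any time."""
    return {a: b for a, b in memory.items() if a != address}

def cons(memory, car, cdr):
    """A cons, as a personality might define it: just a two-cell buffer."""
    address, memory = allocate(memory, 2)
    memory = store(memory, address, 0, car)
    memory = store(memory, address, 1, cdr)
    return address, memory
```

Since a cons is nothing more than a buffer, it is mutable by construction; an
immutable variant would have to forbid `store` on it by some additional check.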

At this point the reader may already be suspecting that the global environment 
could be used for simulating memory; while — assuming the availability of certain 
primitives — that intuition is correct, §3.2.2 will provide a strong argument in favor 
of having a separate memory state environment. 

2.4.4 Procedures 

A state also keeps track of the current set of procedure definitions. 

The procedure state environment is a $\mathbb{F} \rightharpoonup (\mathbb{X}^* \times \mathbb{E})$ partial function mapping
each procedure name into a pair holding its zero or more formal parameters and
the procedure body; for example, if a procedure named $f$ has formals $x_1 \ldots x_n$ and
body $e_h$ in the state $\Gamma$ we write "$\Gamma_{\mathtt{procedures}} : f \mapsto (\langle x_1 \ldots x_n \rangle, e_h)$"; we may also omit
angle brackets when no ambiguity can arise, in this case writing
"$\Gamma_{\mathtt{procedures}} : f \mapsto (x_1 \ldots x_n, e_h)$".

We remind the reader that, since ε₀ is a first-order language and all procedures
are global, no nonlocals can exist, hence there is no need for closure environments
at this level. Procedure definitions will be dealt with later, in §3.

Up to this point we have dealt with the syntax of ε₀ expressions only, stating that
global mutually-recursive procedures are also somehow available but without
specifying any way of defining them. Because of interactions with the rest of the system
the issue turns out to be more delicate than one could imagine, and we defer its full
treatment to §3; what we can hint at now is that procedures can be defined with
primitives, as explained below.

^A more radical strategy could involve a syntactic "extension" un-defining or otherwise making
inaccessible the operators needed for the update.

2.4.5 Primitives 

We call primitives a set of low-level routines accessible from expressions, used 
for computation, program reflection or side effects. Primitives range from simple 
arithmetic operations such as + to reflection and procedure definition operations, 
potentially also involving destructive state updates. 

In an implementation primitives are routines implemented in a low-level language
such as C or directly in Assembly. This does not mean however that a primitive is
allowed to "do anything": primitives must not disrupt the program control flow by
performing jumps or non-local exits or reentries à la longjmp or call/cc.
Primitives may affect the global state but have to behave in a procedural fashion, always
giving control back to their caller; primitive behavior can actually be modeled by
partial functions taking a fixed number of parameters and returning a fixed number
of results, including an input and output state. Such higher-order functional
specification is a consequence of the fact that primitives, unlike procedures, are not
directly implemented in ε₀ and hence lack high-level bodies or any treatable "source".

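A primitive function with in-dimension $n$ and out-dimension $m$ ($n, m \in \mathbb{N}$ and
$n, m \ge 0$) is a partial function $(\mathbb{C}^n \times \boldsymbol{\Gamma}) \rightharpoonup (\mathbb{C}^m \times \boldsymbol{\Gamma})$, mapping an $n$-tuple of values
and a state into an $m$-tuple of values and another state; a primitive is a triple
comprising the primitive function, its in-dimension and out-dimension, which respects
Axiom 2.10. We call $\mathbb{P}$ the set of all primitives, with
$\mathbb{P} \triangleq \bigcup_{n, m \in \mathbb{N}} \{\langle p, n, m \rangle \mid p : (\mathbb{C}^n \times \boldsymbol{\Gamma}) \rightharpoonup (\mathbb{C}^m \times \boldsymbol{\Gamma})\}$.

The primitive environment state environment $\Pi \rightharpoonup \mathbb{P}$ maps each primitive name
into a primitive.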

Axiom 2.10, defined in §2.5.3 and only needed for technical reasons, will just 
affirm that primitive success and failure are mutually exclusive. 

From now on we will bend our notation a little further by writing
"$\Gamma_{\mathtt{primitives}}(\pi)(c_1, \ldots, c_n, \Gamma) = \langle c'_1, \ldots, c'_m, \Gamma' \rangle$" or
"$\Gamma_{\mathtt{primitives}}(\pi) :\# \; n \to m$" as needed, to avoid useless pedantries such as
"$p(\langle c_1, \ldots, c_n \rangle, \Gamma) = \langle \langle c'_1, \ldots, c'_m \rangle, \Gamma' \rangle$ where $\Gamma_{\mathtt{primitives}}(\pi) = \langle p, n, m \rangle$".

As an example, considering the quotient-remainder primitive of §2.1.5 in some
state $\Gamma$ we could write
"$\Gamma_{\mathtt{primitives}}(\mathtt{quotient\text{-}remainder})(\mathcal{N}(13), \mathcal{N}(3), \Gamma) = \langle \mathcal{N}(4), \mathcal{N}(1), \Gamma \rangle$"
meaning that the quotient and remainder of the naturals 13 and
3 are (respectively) the naturals 4 and 1, and that the primitive does not affect the
global state; we could also write "$\Gamma_{\mathtt{primitives}}(\mathtt{quotient\text{-}remainder}) :\# \; 2 \to 2$", by
which we would mean that quotient-remainder has two parameters and two
results, which does not prevent the primitive function from being partial, as indeed
it is. Where the particular state is obvious from the context or irrelevant, we even
write "$\pi :\# \; n \to m$" to mean $\Gamma_{\mathtt{primitives}}(\pi) :\# \; n \to m$, for the appropriate $\Gamma$.
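
The view of a primitive as a triple made of a partial function and two
dimensions can be sketched in Python as follows. This is an illustration under
stated assumptions rather than the actual implementation: naturals are plain
integers, partiality is modeled by returning `None`, the primitive environment
is a module-level dict instead of a component of the state, and the names
`Primitive`, `PRIMITIVES` and `apply_primitive` are hypothetical.

```python
from typing import Callable, NamedTuple

class Primitive(NamedTuple):
    function: Callable   # (values, state) -> (results, state), or None where undefined
    in_dimension: int
    out_dimension: int

def quotient_remainder(values, state):
    """Quotient and remainder of two naturals; undefined on a zero divisor."""
    dividend, divisor = values
    if divisor == 0:
        return None                      # the primitive function is partial
    return (dividend // divisor, dividend % divisor), state  # state unchanged

PRIMITIVES = {"quotient-remainder": Primitive(quotient_remainder, 2, 2)}

def apply_primitive(name, values, state):
    """Apply a named primitive; None means that the application is undefined."""
    primitive = PRIMITIVES[name]
    if len(values) != primitive.in_dimension:
        return None                      # wrong number of parameters
    outcome = primitive.function(tuple(values), state)
    if outcome is None:
        return None
    results, new_state = outcome
    assert len(results) == primitive.out_dimension
    return results, new_state
```

For instance `apply_primitive("quotient-remainder", (13, 3), {})` yields the
pair of results 4 and 1 together with the unchanged state, while a zero divisor
yields `None`.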

As a further and possibly more interesting case, we just hint at the fact that 
memory operations such as allocating buffers and loading and storing words are 
performed by appropriate primitives: this will let us keep the semantics simple, 
ignoring the details of memory, and treating memory operations as just another 
instance of effects on the global state. 

Specifying a complete set of "default" primitives is out of the scope of this work, but 
§5 will informally introduce most primitives currently used in the implementation, 
while §3 will deal with reflection and program-updating in relation with primitives. 

We may informally speak of applicables, when abstracting away the distinction 
between procedures and primitives. 

2.4.6 Holed expressions 

In our dynamic semantics we need to capture intermediate computation snapshots 
in which an expression is in the middle of being evaluated. 

We define below an extended ε₀ expression grammar, where the hole "$\square$" stands
for a subexpression which is yet to be fully evaluated.

Definition 2.4 ($\varepsilon_\square$ syntax) We define the set $\mathbb{E}_\square$ of possibly-holed expressions or
"$\varepsilon_\square$ expressions" by the following grammar:

$e_\square ::=$
    $e$
  | $[\mathtt{let}\ x^*\ \mathtt{be}\ \square\ \mathtt{in}\ e]_h$
  | $[\mathtt{call}\ f\ \square]_h$
  | $[\mathtt{primitive}\ \pi\ \square]_h$
  | $[\mathtt{if}\ \square \in \{c^*\}\ \mathtt{then}\ e\ \mathtt{else}\ e]_h$
  | $[\mathtt{bundle}\ \square]_h$
  | $[\mathtt{fork}\ f\ \square]_h$
  | $[\mathtt{join}\ \square]_h$

Syntactic cases are respectively named: non-holed expression, holed block or holed
let, holed call or holed procedure call, holed primitive or holed primitive call, holed
conditional, holed bundle, holed fork and holed join.

All cases save the first represent properly holed expressions. □

Notice that holes do not occur in all possible expression contexts: this issue is related
to tail contexts.

Moreover, as the nonterminal $e_\square$ never occurs in a production right-hand side,
no holed expression can contain other properly holed expressions: this "single hole"
property reflects ε₀'s deterministic sequential evaluation strategy.

2.4.7 Stacks 

Rather than resorting to the traditional small-step semantics style [99, §2.6] in which 
the computed parts of an expression are replaced with values, here we adopt a more 
realistic and lower-level model using explicit stacks and keeping track of "return 
points"; this should already be clear at this point from §2.4.6. 

We keep two separate aligned stacks per thread for describing evaluation, one stack 
representing the dynamic nesting of partially-evaluated expression forms and the 
other representing the dynamic nesting of values; we respectively call them the 
main stack or even simply the stack, and the value stack: 

• The main stack is a sequence of pairs, each pair containing a holed expression
and its associated local environment (§2.4.2): the set of all possible main stacks
is $\mathbb{S} = (\mathbb{E}_\square \times (\mathbb{X} \rightharpoonup \mathbb{C}))^*$;

• The value stack is a sequence of objects, each of which is one of a value,
the value separator "$\ddagger$", or the activation separator "$\dagger$". Value stacks belong to
$\mathbb{V} = (\mathbb{C} \cup \{\ddagger, \dagger\})^*$.

A two-stack solution is particularly appropriate because of bundles and is visually 
intuitive, but of course efficient implementations for conventional machines will rea- 
sonably use a single stack per thread. 

We write stacks horizontally, with the top on the left: this is analogous to list 
syntax in Lisp and functional languages, and the opposite of Forth conventions. 

We usually represent main stacks with the metavariable $S$ and value stacks with
the metavariable $V$, possibly decorated.

2.4.8 Futures 

As we have already hinted at in §2.4.7 and will make clearer in §2.5, evaluation
in ε₀ needs two stacks per thread, along with the global state.

The "main" thread of a computation is called the foreground thread; the global
state holds information about all the others.

We call future state environment the state environment futures holding thread
information. Such an environment simply maps each thread identifier into its stack
and value stack, and belongs to $\mathbb{T} \rightharpoonup (\mathbb{S} \times \mathbb{V})$.

Implementation Note 2.5 (global state resource limits) In an implementa- 
tion the following resource limits hold: 

• each global environment binding occupies a constant amount of the memory
resource;

• each memory cell occupies a constant amount of the memory resource;

• each defined procedure occupies a constant amount of the memory resource;

• each defined primitive occupies a constant amount of the memory resource;

• each thread which is either running or being waited for by a join expression in a
background or foreground thread (§2.5.1) occupies a constant amount of memory.

An implementation may also limit the number of threads running or being 
waited for (as above) existing at any given moment, independently from mem- 
ory usage. 

In all the cases above, some implementations may also scale the total amount of 
occupied resource by a logarithmic factor. □ 

2.4.9 Configurations 

A configuration contains information about the foreground thread, and a global
state; of course the global state, among the rest, holds information about the other
"background" threads.

The set of all configurations is $\mathbb{S} \times \mathbb{V} \times \boldsymbol{\Gamma}$; we usually represent configurations with
the letter $\chi$, possibly decorated; since configurations are potentially complex, when
we show their three components we always omit commas to reduce the visual clutter.

Evaluating a given expression $e_h$ in a given state $\Gamma$ entails building an initial
configuration $(e_h, \emptyset) \;\; \ddagger \;\; \Gamma$: an initial configuration always has a main stack made by just
a non-holed expression coupled with an empty local environment, and a value stack
made of just one separator.

Final success configurations contain an empty main stack and a value stack
$\ddagger c_n c_{n-1} \ldots c_2 c_1 \ddagger$ holding the zero or more elements of the result bundle in a reversed
sequence, preceded and followed by a "$\ddagger$" separator; the bundle inversion
phenomenon is a consequence of the LIFO evaluation style.

For example, assuming a "reasonable" + primitive and some state $\Gamma$, we expect that
by evaluating starting from the initial configuration
$([\mathtt{primitive}\ {+}\ \mathcal{N}(2)_{h_1}\ \mathcal{N}(3)_{h_2}]_{h_0}, \emptyset) \;\; \ddagger \;\; \Gamma$
we eventually reach a final success configuration $\langle\rangle \;\; \ddagger \mathcal{N}(5) \ddagger \;\; \Gamma'$; it is possible
to have $\Gamma \neq \Gamma'$ because of background threads already started in $\Gamma$.

2.5 Small-step dynamic semantics 

We are now finally ready to specify ε₀'s dynamic semantics.

2.5.1 Small-step reduction

We need to formalize the intuitive notion of reduction. Given two configurations
$\chi$ and $\chi'$, we say that $\chi$ reduces to $\chi'$ and we write "$\chi \longrightarrow_E \chi'$" according to the
following definition:

Definition 2.6 (small-step reduction) We define the small-step evaluation
relation $\longrightarrow_E \;\subseteq\; (\mathbb{S} \times \mathbb{V} \times \boldsymbol{\Gamma}) \times (\mathbb{S} \times \mathbb{V} \times \boldsymbol{\Gamma})$ according to the rules below. In the
rules we always assume $n \ge 0$, with the convention that an indexed sequence with
left index 1 and right index 0 is empty.

Each rule has an associated name, written on the left in brackets. □



[constant] 



(ch, p).s IV t—^eS iciv r 



" [^ariaWe] (^^^ ^^.^ IV T S IdV T ^global-environment [p] : X ^ c 

[le^e] ^j-^^^ a;^...a;^ be en^ in e^J/io, p).^ T — >-e (e/j^, /9).([let xi...Xn be □ in e^J/K,, p).S IV T 
([let xi.-.Xn be □ in e^ij/io, ?CmC^_i...C2Ci2F F — >e {eh2, p[xi <->■ ci,X2 <->■ C2, ...,Xn <->■ Cn]).S IV T 

^""^^^^^ {[call f eh,...ehJho, p)-S IV T ^e {eh,, p)-ieh„, p).([call / ajfto, 0).S ItV F 

[callc] J Q]^^^ p-j ^- lc^lc^_-^^l..lC2lCi4V r *E{eh, P[X1>-^ Ci,X2>-^ C2,...,Xn-l>-^ Cn-l,Xn>-^ Cn]).S iV T P^°"'*"^^= ^ ^n, eh) 

O r . . . 1 

cr [primitivegj 



^ ([primitive TT e/ii...e/i„]/io, ^V" T — >e (e/n, p)-{eh^, p).([priniitive tt Dj/io, 0).S l^V T 
to 

[primitivej — — — — . , , , ^ g w / / -n/ rprimitives(7r)(ci, c„, r) = <c;, c/^, r'> 

^ ([primitive TT Dj/^o, ?Cn2c„-i2-.?C22ci?|F T — ^e S \d^c'^_^...C2c\lV T' 

tr 

^ " ([if e/ii e {ci...c„} then en^ else e/igj/^o, W T — >e (e/^, p).([if □ e {ci...c„} then else eha]/^,, p).^ ?F T 
o 

I ^^^'^^ ([if □ e {ci...c„} then e;,, else e;,3]^o, p).S IdV T ^e {eh„ p).S IV F ^^i^i-^"} 

I L^*"-' ([if □ e {ci...c„} then else eftgjho, p).5 IdV T ^e {ehs, p)-S IV F ^^l^i-^"/ 



CO 
CO 



w [bundlee] ([bundle eft,... e^Jfto, p).S IV T {en,, p)...(eft„, /9).([bmidle ajfto, 0).S iW T 

I L un ecj ([bundle □] p). 5 lCnlCn-il..lC2lciltV F — *e S lCnC„-i...C2CilV F 

^ [forke] ([fork/efe,...eftjfto, F {eh,, p)-(eft„, p).([fork / Dlft^, 0).5 F 

[forkc] —7 p :=rr- ^—7 — fresh t, Tprocedures : / {xo--Xn,eh) 

([fork/n],„ p).S lCnlCn-il..lcMtV F ^E S lT{t)lV r[*7,£,-/[^°^^W'^^^^^'--^"^'="])')]) 



I ^^"^""'^ {[joineh.U, P)-S IV F^E{eh„ p).([join ajft^, p).S IV F 

o 

([joinDK, p).5 imiV F^eS mV F ■ ' ^ ^0' 



St Vt F^ES't VI F' 



s V f^es V F't;:i'ks'^] 



iTfTT^ — Tfutures '■ t ^ {St, Vt) 
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It is easy to classify rules into four sets according to the holed expression case 
in the top stack pair, if any. We have: 

• the basic rules [constant] and [variable] ; 

• expansive rules, one per non-holed expression case, named after the form with 
an "e" subscript; 

• contractive rules, one per holed expression case (except for the conditional, 
which needs two contractive rules), named after the form with a "c" subscript; 

• the parallel rule [∥] in the end, standing apart from all the others.

The core ideas of the evaluation are simple, and strongly rooted in the inductive
nature of the expression syntax. Rule groups help to highlight the quite pleasant
symmetry of the system:

• base: we evaluate a "basic" expression found on the top of the stack by popping 
it and pushing a corresponding value onto the value stack; 

• expansion: before we can evaluate a non-basic expression on the top of the
stack we need to evaluate its sub-expressions: so we replace the expression
with its holed counterpart, and push its subexpressions on the stack on top of
it, in an order such that the first one to be evaluated ends up on the top;

• contraction: if a holed expression is on the top of the stack, this means that 
we have just finished evaluating its subexpressions: pop their values from the 
value stack, pop the holed expression from the stack, and proceed: according 
to the case this can mean pushing a further subexpression onto the stack or 
pushing results onto the value stack; 

• parallelism: the parallel rule lets us concurrently perform a reduction in a 
background thread, whenever possible. 

The LIFO policy outlined above enforces a rigid call-by-value, depth-first left-to-right
evaluation strategy. We find that having such a simple and predictable evaluation
order is very useful for both programming and reasoning about programs.
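
The base, expansion and contraction groups can be sketched as a tiny sequential
interpreter. The following Python fragment is only an illustration under stated
assumptions, not the implementation described in this work: it covers
constants, variables, let, if, bundle and primitive forms (no call, fork, join
or parallel rule), ignores handles, represents expressions as tagged tuples and
values as Python integers, and keeps both stacks as lists with the top at index
0. All tags and function names are hypothetical.

```python
SEP, ACT = "‡", "†"   # value separator and activation separator (assumed glyphs)

class Stuck(Exception):
    """No rule can fire on the current configuration."""

def pop_bundle(vs):
    """Split SEP c_m ... c_1 SEP V into ([c_1, ..., c_m], SEP V)."""
    if SEP not in vs[1:]:
        raise Stuck("no bundle on top of the value stack")
    end = vs.index(SEP, 1)
    if ACT in vs[1:end]:
        raise Stuck("no bundle on top of the value stack")
    return vs[1:end][::-1], vs[end:]

def pop_actuals(vs):
    """Split SEP c_n SEP ... SEP c_1 SEP ACT V into ([c_1, ..., c_n], V)."""
    i, actuals = 1, []
    while i < len(vs) and vs[i] != ACT:
        if vs[i] == SEP or i + 1 >= len(vs) or vs[i + 1] != SEP:
            raise Stuck("expected 1-dimension bundles")
        actuals.append(vs[i])
        i += 2
    if i >= len(vs):
        raise Stuck("missing activation separator")
    return actuals[::-1], vs[i + 1:]

def step(stack, vs, state):
    """Perform one reduction step, returning the new (stack, value stack, state)."""
    if not stack or not vs or vs[0] != SEP:
        raise Stuck("final or malformed configuration")
    (expr, rho), rest = stack[0], stack[1:]
    tag = expr[0]
    if tag == "const":                                    # base: [constant]
        return rest, [SEP, expr[1]] + vs, state
    if tag == "var":                                      # base: [variable]
        env = {**state["global-environment"], **rho}      # local bindings prevail
        if expr[1] not in env:
            raise Stuck("unbound variable")
        return rest, [SEP, env[expr[1]]] + vs, state
    if tag == "let":                                      # expansion: [let_e]
        _, xs, e1, e2 = expr
        return [(e1, rho), (("let_hole", xs, e2), rho)] + rest, vs, state
    if tag == "let_hole":                                 # contraction: [let_c]
        _, xs, e2 = expr
        bundle, below = pop_bundle(vs)
        if len(bundle) < len(xs):
            raise Stuck("bundle dimension is insufficient")
        return [(e2, {**rho, **dict(zip(xs, bundle))})] + rest, below, state
    if tag == "if":                                       # expansion: [if_e]
        _, e1, cases, e2, e3 = expr
        return [(e1, rho), (("if_hole", cases, e2, e3), rho)] + rest, vs, state
    if tag == "if_hole":                                  # contraction: both if rules
        _, cases, e2, e3 = expr
        bundle, below = pop_bundle(vs)
        if len(bundle) != 1:
            raise Stuck("discriminand is not a 1-dimension bundle")
        return [(e2 if bundle[0] in cases else e3, rho)] + rest, below, state
    if tag in ("prim", "bundle"):                         # expansion, opens an activation
        hole = (tag + "_hole",) + expr[1:-1]
        pushed = [(e, rho) for e in expr[-1]] + [(hole, {})]
        return pushed + rest, [SEP, ACT] + vs[1:], state
    if tag == "prim_hole":                                # contraction: [primitive_c]
        actuals, below = pop_actuals(vs)
        prim = state["primitives"].get(expr[1])
        if prim is None or len(actuals) != prim[1]:
            raise Stuck("unbound primitive or wrong arity")
        outcome = prim[0](actuals, state)
        if outcome is None:
            raise Stuck("primitive function undefined")
        results, new_state = outcome
        return rest, [SEP] + list(results)[::-1] + [SEP] + below, new_state
    if tag == "bundle_hole":                              # contraction: [bundle_c]
        actuals, below = pop_actuals(vs)
        return rest, [SEP] + actuals[::-1] + [SEP] + below, state
    raise Stuck("unknown expression")

def evaluate(expr, state):
    """Reduce from the initial configuration until the main stack is empty."""
    stack, vs = [(expr, {})], [SEP]
    while stack:
        stack, vs, state = step(stack, vs, state)
    return vs, state

plus = (lambda cs, st: ((cs[0] + cs[1],), st), 2, 1)      # (function, in, out)
state = {"global-environment": {"x": 42}, "primitives": {"+": plus}}
program = ("prim", "+", [("const", 2), ("const", 3)])
print(evaluate(program, state)[0])   # ['‡', 5, '‡']
```

The sketch raises `Stuck` exactly where no rule of the covered subset applies;
a real implementation, as noted in Implementation Note 2.2, is free to omit
such checks.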

[constant] is trivial. 

In [variable] it should be noted how the (topmost) local environment prevails 
over the current global environment in the variable rules. Of course the rule cannot 
fire if the variable is unbound. 

[let_e] is simple enough: a let block is evaluated by first pushing the let-bound
expression $e_{h_1}$; when such evaluation eventually ends producing a bundle in the value
stack the let contractive rule can fire, assuming the bundle dimension is sufficient:
then the holed let expression is replaced by the let body on the stack, with an
updated environment in which the first $n$ bundle components are named, and all
$m$ of them are popped off the value stack, implementing the behavior described in
§2.1.5. It should be remarked that the holed let expression "disappears from the
stack" as soon as its body is pushed. This behavior is useful for potentially
tail-position subexpressions: after we reduce a let block to its body the let block itself
can be disposed of, saving stack space.

[call_e] just consists in replacing the call expression on the top with its holed
counterpart (with an immaterial local environment), and pushing actuals on top of
it, so that they will be evaluated starting from the leftmost one, all in the same
local environment as the call. When actuals are evaluated the call contractive
rule has the opportunity to fire, provided that the value stack contains a topmost
activation with exactly as many 1-dimension bundles as the (current!) number of
parameters of the called procedure, and of course provided that a procedure with
the appropriate name exists. If that is the case the holed call expression is replaced
with the procedure body, and the local environment with an environment containing
only the parameter bindings. It is crucial here not to extend the call-time local
environment, since we want to prevent nonlocal visibility, for efficiency reasons. In
a similar vein to the let case, a tail-position holed call is replaced by the called
procedure body.

[primitive_e] and [primitive_c] are very similar to their call counterparts; the
contractive rule cannot fire if the primitive name is not bound, or the primitive
function is undefined. Notice that the primitive function is allowed to return a
new global state, and the contractive rule effectively establishes it for the resulting
configuration.

[if_e] simply replaces the topmost expression with a holed conditional, pushing
the discriminand subexpression on top of it; when the discriminand is completely
evaluated, either of the two if contractive rules [if_c∈] and [if_c∉] may fire, provided
the discriminand yielded a 1-dimension bundle: if the value belongs to the
conditional case set, the then subexpression replaces the holed if; otherwise, the else
subexpression does. The conditional expression is replaced by one of the branch
subexpressions without consuming stack space, which is useful in tail contexts.

[bundle_e] resembles [call_e] and [primitive_e]; again the empty environment
associated to the bundle holed expression is immaterial. [bundle_c], if the correct
number of 1-dimension bundles is on the top of the value stack, replaces them all
with a single bundle holding all the values.

[fork_e] is essentially identical to [call_e], [primitive_e] and [bundle_e]. [fork_c]
is more interesting: if the actual parameter result bundles are 1-dimensioned and
correct in number, they are simply replaced by one future on the value stack, and
the fork evaluation terminates immediately: the concurrent evaluation will take
place asynchronously in a new thread created for the purpose, and associated to the
future identifier in the future state environment. Notice that the thread identifier is
also visible to the new thread as the zeroth parameter, to be used in personalities
for "self-thread-name" forms or thread-local variables.

[join_e] replaces the topmost expression with its holed counterpart, pushing its
future expression over it; [join_c], provided that a 1-dimension bundle is on the top
of the value stack, that bundle contains a future value and the thread corresponding to the
future has terminated, returns the result from the thread. [join_c] cannot fire until the
asynchronous thread has terminated.

[∥], provided that a configuration obtained by making a background thread
the foreground thread could reduce, allows performing the reduction "concurrently",
in the future state environment.

It is easy to see that in the contractive rules [call_c], [primitive_c], [bundle_c] and
[fork_c] the local environment associated to the holed expression will in practice
always be empty, for reachable configurations.

The role of "$\ddagger$" value separators should be clear at this point: the values of the
same bundle are stored on the value stack sequentially, without any separators in
between, and again in reverse order, because of the LIFO strategy. Separators
help establish the correct conditions for a rule to fire, so that no bundle of the wrong
dimension can be used.

The motivation for activation separators "$\dagger$" is similar but slightly more subtle:
the problem is being able to distinguish a local, temporary bundle from a surrounding
bundle which is being built on the value stack. Without such explicit markers
it would be possible to pop the "wrong" number of values from the value stack.

Moreover a procedure can be redefined, or even defined for the first time, by
one of its actual parameters. We only define semantics if the number of the passed
parameters is correct, but the correct number cannot be determined until all of them
have been evaluated: hence, before letting a contractive rule fire, we have to check
that the topmost objects in the value stack are all and only the actual values.

At this point it may be worth reminding the reader of Implementation Note 2.2:
markers do not necessarily need to be represented and checked for at run time in an
efficient implementation; quite the opposite, by specifying that some case yields an
error we free ourselves from any implementation constraint.

For this reason we intentionally let, for example, "wrong arity" be an error condi- 
tion (§2.5.3) instead of specifying some "fallback behaviour" such as ignoring extra 
arguments or providing defaults for missing ones: in practice an efficient imple- 
mentation will need to reserve stack frame slots or registers for return addresses, 
garbage collection structures or for some other implementation bookkeeping pur- 
pose: of course passing the wrong number of parameters will likely interfere with 
these conventions. We do not want this to be made more difficult or less efficient just 
because of the need of implementing a specific behavior, whose utility was dubious 
in the first place. 

Unfortunately an implementation cannot let configurations grow to an arbitrary 
complexity: 

Implementation Note 2.7 (dynamic execution resource limits) Each instance
of the following items occupies some memory in an implementation (see
Implementation Note 2.3):

• a stack item; 

• a value stack item; 

• a local environment binding. 

Some implementations may further limit the stack item and value stack item number 
to another smaller constant, independently from memory usage. □ 

2.5.2 Sequential reduction 

As we have just remarked in §2.5.1, we value the predictability of ε₀ semantics, with
its well-specified evaluation strategy. In the same vein determinism in an evaluation
relation is a desirable property.

It is easy to observe that, save for the parallel rule, the reduction relation is in 
fact trivially deterministic, up to the (immaterial) choice of thread identifiers. 

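Definition 2.8 (sequential small-step reduction) We define the sequential
small-step evaluation relation, also a subset of
$(\mathbb{S} \times \mathbb{V} \times \boldsymbol{\Gamma}) \times (\mathbb{S} \times \mathbb{V} \times \boldsymbol{\Gamma})$, according to the
rules of Definition 2.6, minus the parallel rule. □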

Interestingly, a sequential reduction can still work with fork and join, and futures 
can be passed around and even created anew or joined if their result is ready: since 
the only source of non-determinism is the actual concurrent reduction, as long as 
no background thread "advances" it is possible to work with futures using only 



2.5.3 Failure 

In §2.5.1 we have explicitly shown that there are cases in which the small-step seman- 
tics is undefined because rule premises cannot be satisfied. After having formalized 
the notion of "correct reduction", here we are going to exactly specify and classify 
failure conditions. 

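Definition 2.9 (error configurations) We define the error configuration relations
fails because of environments, written as "$\longrightarrow_E \lightning_x$", and fails because of
dimension, written as "$\longrightarrow_E \lightning_\#$", all subsets of the set of configurations $\mathbb{S} \times \mathbb{V} \times \boldsymbol{\Gamma}$,
by the following rules:

$$(x_h, \rho).S \quad \ddagger V \quad \Gamma \;\longrightarrow_E\; \lightning_x \qquad \text{if } x \notin dom(\Gamma_{\mathtt{global\text{-}environment}}[\rho])$$

$$([\mathtt{let}\ x_1 \ldots x_n\ \mathtt{be}\ \square\ \mathtt{in}\ e_{h_2}]_{h_0}, \rho).S \quad V \quad \Gamma \;\longrightarrow_E\; \lightning_\# \qquad \text{if } \nexists m : (m \ge n \wedge V = \ddagger c_m c_{m-1} \ldots c_2 c_1 \ddagger V')$$

$$([\mathtt{call}\ f\ \square]_{h_0}, \rho).S \quad V \quad \Gamma \;\longrightarrow_E\; \lightning_\# \qquad \text{if } \neg(\Gamma_{\mathtt{procedures}} : f \mapsto (x_1 \ldots x_n, e_h) \wedge V = \ddagger c_n \ddagger c_{n-1} \ddagger \ldots \ddagger c_2 \ddagger c_1 \ddagger \dagger V')$$

$$([\mathtt{primitive}\ \pi\ \square]_{h_0}, \rho).S \quad V \quad \Gamma \;\longrightarrow_E\; \lightning_\# \qquad \text{if } \Gamma_{\mathtt{primitives}}(\pi) :\# \; n \to m \wedge V \neq \ddagger c_n \ddagger c_{n-1} \ddagger \ldots \ddagger c_2 \ddagger c_1 \ddagger \dagger V'$$

$$([\mathtt{if}\ \square \in \{c_1 \ldots c_n\}\ \mathtt{then}\ e_{h_2}\ \mathtt{else}\ e_{h_3}]_{h_0}, \rho).S \quad V \quad \Gamma \;\longrightarrow_E\; \lightning_\# \qquad \text{if } V \neq \ddagger c \ddagger V'$$

$$([\mathtt{bundle}\ \square]_{h_0}, \rho).S \quad V \quad \Gamma \;\longrightarrow_E\; \lightning_\# \qquad \text{if } \nexists n : V = \ddagger c_n \ddagger c_{n-1} \ddagger \ldots \ddagger c_2 \ddagger c_1 \ddagger \dagger V'$$

$$([\mathtt{fork}\ f\ \square]_{h_0}, \rho).S \quad V \quad \Gamma \;\longrightarrow_E\; \lightning_\# \qquad \text{if } \neg(\Gamma_{\mathtt{procedures}} : f \mapsto (x_0 \ldots x_n, e_h) \wedge V = \ddagger c_n \ddagger c_{n-1} \ddagger \ldots \ddagger c_2 \ddagger c_1 \ddagger \dagger V')$$

$$([\mathtt{join}\ \square]_{h_0}, \rho).S \quad V \quad \Gamma \;\longrightarrow_E\; \lightning_\# \qquad \text{if } V \neq \ddagger c \ddagger V'$$

The fails because of a primitive relation $\longrightarrow_E \lightning_p \;\subseteq\; \mathbb{S} \times \mathbb{V} \times \boldsymbol{\Gamma}$ is a superset
of the relation defined by the following rule:

$$([\mathtt{join}\ \square]_{h_0}, \rho).S \quad \ddagger c \ddagger V \quad \Gamma \;\longrightarrow_E\; \lightning_p \qquad \text{if } \nexists t : c = \mathcal{T}(t)$$

The exact definition of $\longrightarrow_E \lightning_p$ relies on the specific set of available
primitives, which we intentionally leave open.

We define the generic "fails" relation, written as "$\longrightarrow_E \lightning$", as the union of the
specific failure relations:
$(\longrightarrow_E \lightning) = (\longrightarrow_E \lightning_x) \cup (\longrightarrow_E \lightning_p) \cup (\longrightarrow_E \lightning_\#)$.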
We also call final failure configuration a configuration that fails. A final con- 
figuration is either a final success configuration or a final failure configuration. □ 

Since the definition above does not mention background threads at all we have
that failure in a background thread does not propagate to any other thread. We
chose this solution in the interest of simplicity and realism for a core language such as
ε₀, which should reflect the behavior of system-level facilities. Of course higher-level
personalities are free to implement more complex policies, as hinted at in §2.5.4.

Since failure never propagates to other threads in ε₀, there is no need for
alternate "sequential" relations for failure.

As a further point of note, above we have chosen to classify the failure of joining
an object different from a future as a primitive error, because of the strong
analogy of the condition with a primitive "wrong parameter" error.*

Many actual primitives also fail for some values of their parameters, even when
they receive the correct number of them. For example a division primitive "÷" might
fail on a zero divisor; writing "$\mathcal{N}(0)$" for zero as in §2.2, we get:

$$([\text{primitive } \div\ \square]_{h_0}, \rho).S \;\; \ddagger \mathcal{N}(0) \ddagger c \ddagger V \;\; \Gamma \;\longrightarrow_{\mathcal{E}}\; \lightning_P$$

*It is easy at this point to mistake such an error for a type error. The difference is actually
subtle, and will be dealt with in §3.3.1.

We intentionally omit a list of all the specific cases of primitive failure, a complete 
specification belonging in the primitive definition — with the only constraint of 
having failure rules covering all possible failure cases; in other words, given a set of 
parameters a primitive either fails or returns a result but no other behavior such as 
divergence is possible, as specified by Axiom 2.10, which we are now ready to state: 

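Axiom 2.10 (primitive "totality") For any primitive $\pi$ such that
$\Gamma_{\text{primitives}}(\pi) : n \to m$ in some state $\Gamma$ and for each sequence
$\langle c_1, \ldots, c_n \rangle$, exactly one of the following holds:

• there exist $c'_1, \ldots, c'_m, \Gamma'$ such that
  $\Gamma_{\text{primitives}}(\pi)(c_1, \ldots, c_n, \Gamma) = (c'_1, \ldots, c'_m, \Gamma')$;

• for any $h_0$, $S$, $V$ we have
  $([\text{primitive } \pi\ \square]_{h_0}, \rho).S \;\; \ddagger c_1 \ddagger \ldots \ddagger c_n \ddagger V \;\; \Gamma \;\longrightarrow_{\mathcal{E}}\; \lightning_P$. □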
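The shape of the axiom can be illustrated with a small Python sketch. It is only an analogy under stated assumptions: the names `Result`, `PrimitiveFailure`, `divide` and `apply_primitive` are hypothetical and not part of ε₀, integers stand in for values, and a plain dictionary stands in for the state. The point shown is that a primitive has exactly two possible outcomes for parameters of the right dimension, a result with a new state or a primitive failure; Python itself cannot enforce the absence of divergence, which here holds only because `divide` has no loops.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple, Union


@dataclass(frozen=True)
class Result:
    """Successful outcome: the result values and the (possibly updated) state."""
    values: Tuple[int, ...]
    state: dict


@dataclass(frozen=True)
class PrimitiveFailure:
    """Failing outcome, playing the role of a "fails because of a primitive" rule."""
    reason: str


Outcome = Union[Result, PrimitiveFailure]
Primitive = Callable[[Tuple[int, ...], dict], Outcome]


def divide(args: Tuple[int, ...], state: dict) -> Outcome:
    """Hypothetical division primitive with in-dimension 2 and out-dimension 1."""
    dividend, divisor = args
    if divisor == 0:
        return PrimitiveFailure("zero divisor")
    return Result((dividend // divisor,), state)


def apply_primitive(primitive: Primitive, in_dimension: int,
                    args: Sequence[int], state: dict) -> Outcome:
    """Apply a primitive to exactly in_dimension parameters."""
    if len(args) != in_dimension:
        # A wrong parameter count is a dimension failure, outside the axiom's scope.
        raise ValueError("wrong number of parameters")
    return primitive(tuple(args), state)
```

Keeping the dimension check outside the primitive mirrors the separation between dimension failures and primitive failures in Definition 2.9.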

We can prove a result in the same spirit for general expressions: any given 
configuration either can be reduced for at least one more step, or it immediately 
fails; but it is not possible that a non-failing configuration does not allow reductions 
(unless joining a future), or that a configuration simultaneously fails and allows a 
sequential reduction: 

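Proposition 2.11 (reduce xor fail xor wait) Given any reachable configuration
$\chi = (e_h, \rho).S \;\; V \;\; \Gamma$ we have exactly one of the following:

• there exist $S'$, $V'$ and $\Gamma'$ such that
  $(e_h, \rho).S \;\; V \;\; \Gamma \;\longrightarrow_{\mathcal{E}}^{|}\; S' \;\; V' \;\; \Gamma'$;

• $(e_h, \rho).S \;\; V \;\; \Gamma \;\longrightarrow_{\mathcal{E}}\; \lightning$;

• there exist $t \in T$, $V'$ such that
  $\chi = ([\text{join } \square]_h, \rho).S \;\; \ddagger \mathcal{T}(t) \ddagger V' \;\; \Gamma$.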

Proof (sketch) Since the main stack is not empty, $\chi$ is not terminal.

We are dealing with $\longrightarrow_{\mathcal{E}}^{|}$ rather than $\longrightarrow_{\mathcal{E}}$, hence [∥] cannot fire,
by Definition 2.6.

In any configuration where the top expression is a non-holed expression
except for the variable, we may apply an expansive rule or [constant] for
$\longrightarrow_{\mathcal{E}}^{|}$, leading to an evaluation step: all such rules can always fire
independently of subexpressions, the state of the environments or the stacks
(Definition 2.6).

If $\chi = ([\text{join } \square]_h, \rho).S \;\; \ddagger \mathcal{T}(t) \ddagger V' \;\; \Gamma$ for some $t$, $V'$ then the thesis
follows trivially.

In the remaining cases the top expression is a variable or a holed expression, and
another disjoint set of rules applies. Then one of the contractive rules for
$\longrightarrow_{\mathcal{E}}^{|}$, or an error rule for $\longrightarrow_{\mathcal{E}} \lightning_X$, $\longrightarrow_{\mathcal{E}} \lightning_\#$ and
$\longrightarrow_{\mathcal{E}} \lightning_P$ in Definition 2.9, must apply.

In all the cases above it is easy to see that each configuration matches the premise
of exactly one rule (in particular, since $\chi$ is reachable, the value stack must have
$\ddagger$ on top); the only nontrivial case is $[\text{primitive } \pi\ \square]_h$, in which either
[primitive] or a $\longrightarrow_{\mathcal{E}} \lightning_P$ rule applies because of Axiom 2.10. ■

2.5.4 Error recovery and personalities

At the level of ε₀ all errors are fatal. In the interest of simplicity and efficiency, no
mechanism is provided for handling a case of failure by recovering or retrying. Any
such machinery can be defined in high-level personalities by checking for failure
conditions at run time with explicit conditional expressions to be automatically
generated; in this way it is possible to completely prevent ε₀ failures from ever
occurring, if so desired.

As usual and in the same spirit as typing, the personality implementor has the
freedom of choosing an efficient model where failures are always fatal, or a friendlier
alternative where the personality presents an ordinary ε₀ configuration as an "error"
state from which the conventionally "normal" execution can resume.

As with static typing, it is also possible to check for the possibility of failures
"statically" at code generation time, and generate fast code under the assumption that
some kind of failure is impossible. The next chapters hint at how one can define
such analyses.

2.6 One-step dynamic semantics 

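When dealing with "toplevel" ε₀ expressions, often we are less interested in the
small-step evaluation relation $\longrightarrow_{\mathcal{E}}$ than in its iteration: where only the final
configurations (if any) are of interest it is convenient to completely ignore stacks and
value stacks, restricting our attention to an expression, an initial state, its result
bundle and the terminal state.

Definition 2.12 (one-step convergence) We define the one-step operational
semantics relation for expressions
$\_\ \_ \Downarrow_{\mathcal{E}} \_\ \_ \;\subseteq\; (E \times \Gamma) \times (C^* \times \Gamma)$ by the rule:

$$\frac{(e_h, \emptyset) \;\; \ddagger \;\; \Gamma \;\longrightarrow_{\mathcal{E}}^{*}\; \langle\rangle \;\; \ddagger c_n \ldots c_1 \ddagger \;\; \Gamma'}
       {e_h \;\; \Gamma \;\Downarrow_{\mathcal{E}}\; \langle c_1 \ldots c_n \rangle \;\; \Gamma'}$$

Similarly, we define the one-step sequential operational semantics relation for
expressions $\_\ \_ \Downarrow_{\mathcal{E}}^{|} \_\ \_ \;\subseteq\; (E \times \Gamma) \times (C^* \times \Gamma)$ by the rule:

$$\frac{(e_h, \emptyset) \;\; \ddagger \;\; \Gamma \;\longrightarrow_{\mathcal{E}}^{|\,*}\; \langle\rangle \;\; \ddagger c_n \ldots c_1 \ddagger \;\; \Gamma'}
       {e_h \;\; \Gamma \;\Downarrow_{\mathcal{E}}^{|}\; \langle c_1 \ldots c_n \rangle \;\; \Gamma'}$$

When we have that "$e_h\ \Gamma \Downarrow_{\mathcal{E}} \langle c_1 \ldots c_n \rangle\ \Gamma'$" we say that $e_h$ in $\Gamma$ converges to
$\langle c_1 \ldots c_n \rangle$ in $\Gamma'$. In the same way when we have
"$e_h\ \Gamma \Downarrow_{\mathcal{E}}^{|} \langle c_1 \ldots c_n \rangle\ \Gamma'$" we say that $e_h$ in $\Gamma$ sequentially converges to
$\langle c_1 \ldots c_n \rangle$ in $\Gamma'$.

We may omit state names or results when irrelevant in context. □

It trivially follows from the determinism of $\longrightarrow_{\mathcal{E}}^{|}$ that $\_\ \_ \Downarrow_{\mathcal{E}}^{|} \_\ \_$ is also a
(partial) function.

Notice that according to our definition a reduction chain of $\longrightarrow_{\mathcal{E}}$ may
converge even if some background thread potentially runs forever, when a finite
reduction chain exists for $\longrightarrow_{\mathcal{E}}^{|}$, and hence also for $\longrightarrow_{\mathcal{E}}$, of which
$\longrightarrow_{\mathcal{E}}^{|}$ is a sub-relation.

It is also useful to speak of the eventual failure of an expression in a state,
ignoring the zero or more reduction steps leading to the failure configuration, and the
specific failure configuration as well: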

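Definition 2.13 (eventual failure) For each error-configuration relation
$\longrightarrow_{\mathcal{E}} \lightning_f$, with $f \in \{$"", "X", "P", "#"$\}$, we define a corresponding eventual
failure relation $\_\ \_ \Downarrow_{\mathcal{E}} \lightning_f \;\subseteq\; E \times \Gamma$ by the meta-rule:

$$\frac{(e_h, \emptyset) \;\; \ddagger \;\; \Gamma \;\longrightarrow_{\mathcal{E}}^{*}\; \chi \qquad \chi \;\longrightarrow_{\mathcal{E}}\; \lightning_f}
       {e_h \;\; \Gamma \;\Downarrow_{\mathcal{E}}\; \lightning_f}$$

If we have that "$e_h\ \Gamma \Downarrow_{\mathcal{E}} \lightning_f$" with $f \in \{$"", "X", "P", "#"$\}$ we respectively say that
$e_h$ in $\Gamma$ eventually fails, eventually fails because of environments, eventually fails
because of primitives, or eventually fails because of dimension. □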

Finally, we characterize looping expressions and states: 

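Definition 2.14 (divergence) We define the divergence relation
$\_\ \_ \Uparrow_{\mathcal{E}} \;\subseteq\; E \times \Gamma$ the following way:

let $e_h$ be an expression and $\Gamma$ be a state; then we say that $e_h$ diverges in $\Gamma$ and
we write "$e_h\ \Gamma \Uparrow_{\mathcal{E}}$" if for any configuration $\chi$ such that
$(e_h, \emptyset) \;\; \ddagger \;\; \Gamma \longrightarrow_{\mathcal{E}}^{*} \chi$ there exists another configuration $\chi'$ such that
$\chi \longrightarrow_{\mathcal{E}} \chi'$. □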

We defined divergence with the parallel reduction relation $\longrightarrow_{\mathcal{E}}$ rather than
its sequential restriction; hence our notion of divergence covers both "busy looping"
in the foreground and waiting forever for a background thread.

It is worth stressing how, for example, the sentence "$e_h$ in $\Gamma$ does not
converge" has a different meaning from "$e_h$ in $\Gamma$ diverges", since it is possible that
$e_h$ in $\Gamma$ eventually fails. The same problem holds for the phrases "converges" and
"eventually fails".

We will avoid such wording in the negative.
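The distinction between the three outcomes can be made concrete with a bounded driver that iterates a step function. This is a sketch under assumptions, not part of the formal development: `classify`, its parameters and the `Outcome` names are hypothetical, configurations are arbitrary Python values, and because divergence cannot be observed in finite time the driver can only report that it ran out of its step budget.

```python
from enum import Enum
from typing import Callable, Tuple, TypeVar

Config = TypeVar("Config")


class Outcome(Enum):
    CONVERGES = "converges"
    EVENTUALLY_FAILS = "eventually fails"
    UNDETERMINED = "undetermined within the step budget"


def classify(initial: Config,
             step: Callable[[Config], Config],
             is_final_success: Callable[[Config], bool],
             is_failure: Callable[[Config], bool],
             fuel: int) -> Tuple[Outcome, Config]:
    """Iterate a deterministic small-step function for at most `fuel` steps."""
    config = initial
    steps_taken = 0
    while True:
        if is_final_success(config):
            return Outcome.CONVERGES, config
        if is_failure(config):
            return Outcome.EVENTUALLY_FAILS, config
        if steps_taken == fuel:
            # Neither convergence nor failure was observed: this is not a
            # proof of divergence, only the absence of an answer so far.
            return Outcome.UNDETERMINED, config
        config = step(config)
        steps_taken += 1
```

With a toy countdown where configurations are integers, `classify(3, lambda n: n - 1, lambda n: n == 0, lambda n: n < 0, 10)` reports convergence, while a step function that never changes its configuration only ever yields `UNDETERMINED`.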

2.7 Summary 

We conceived the ε₀ core language for expressivity, efficiency and ease of formal
manipulation: ε₀ is powerful but idiosyncratic and unsuitable for direct use by human
users, who are expected to only access it through extensions.

After dealing at length with design issues and providing a rationale for the ε₀
language, we proceeded to formally specify its syntax, semantics and error conditions.

We also described a sufficient set of conditions under which an implementation is
compelled to respect the specified behavior, allowing for both inefficient but friendly
and efficient but unsafe implementations.

The small-step operational semantics is relatively simple and has a deterministic
sub-relation obtained by simply ignoring one rule.

We defined the "one-step" semantics, hiding the complexity of stacks, by iterating
the small-step reduction relation.



Chapter 3 

Reflection and self-modification



The presentation of ε₀ in §2 showed the state component as already containing
procedure and global bindings, but did not illustrate any explicit way of updating
either.

In this chapter we will start by discussing global definitions, and then proceed to
clarify what we mean by a "program"; our somewhat unusual solution has important
implications on how programs are loaded, saved, and compiled.

3.1 Global definitions

The expression semantics given in §2.6 does not explicitly mention any functionality
to alter the set of procedure or global bindings that we have always considered as
already defined, as part of the state; however such functionality is clearly needed: if
for nothing else, at least for defining new recursive procedures, as expressions
cannot express recursion without referring to global procedures; and, of course, global
definitions are useful for reasons of modularity.

A "traditional" solution to this problem would consist in adding toplevel forms to
ε₀ as a new syntactic category: toplevel forms would comprise a procedure definition
form and a non-procedure definition form; the program would then become a
sequence of toplevel definitions, possibly followed by a main expression returning a
final "result".

We have several good reasons to reject this simplistic notion of a fixed program
to be written from start to finish and then executed:

• in the spirit of Forth¹ and Scheme, we want to also support interactive systems
interleaving user input with evaluation and answers;

• having two specific definition forms is adequate for ε₀ but not for its extensions:
a personality may need global definitions for new entities such as classes,
exceptions, or types. Even syntax is not fixed: new entities, and the associated
syntactic extensions, would be defined at the top level, to be freely used from
that point on;

• adding toplevel forms to ε₀ is not necessary, since expressions are already
powerful enough to express state updates using primitives or procedures;

• a powerful language should let the user update global definitions from any
program point, not just at the top level.

¹It is not by coincidence that we mention Forth first in this case. One important lesson
of Forth is how complex programs can be written even with no support whatsoever for typing,
provided that each small component is individually testable. We extrapolate the following motto
from our experience of working with ML, Lisp and Forth: the weaker the type system, the more
important a Read-Eval-Print Loop.

For all these reasons we will simply assume the presence of the global-definition
or self-modification procedures state:global-set! and state:procedure-set!,
to be defined later in §5.4.1.3, p. 95. We also assume the presence of their
companion reflective procedures state:global-get, state:procedure-get-formals,
state:procedure-get-body, state:global-names and state:procedure-names.

No toplevel forms are needed²: definitions are expressions like any others, and
can be performed at the top level or just as well within other expressions.

²Global-definition procedures are not particularly friendly to use directly, since the user has to
pass expressions as parameters, and expressions are relatively complex data structures which need
to be built. However a friendly definition form is not hard to define on top of global-definition
procedures, by using a macro (§5).

3.2 Programs and self-modification

The last point in the bulleted list above, easily the most controversial, illustrates well
the tension between our wish to provide an expressive system and the desire to also
keep the language easy to reason about and efficient: in mainstream terms, the
dynamic vs. static debate.

At first sight ε₀ with global-definition procedures appears flatly sided with the
"dynamic" party, allowing any program to capriciously modify itself at run time; for
example in §2.5.1 at p. 35 we even considered the case of calling a not-yet-existing
procedure which is created by one of its actual parameters. Indeed, we will find use
for creating procedures from other expressions (§5.4.4.4); but whenever possible we
would still prefer not to give up better intellectual manageability and efficiency.

As a reasonable compromise and a way out of the dilemma, it is possible to freely use
global-definition procedures to let the program reach a final "static" form; after that
point, the program may be analyzed to check for properties and compiled efficiently,
under the assumption that no more self-modifications will occur.






3.2.1 Programs 

The most convenient notion of a program, for our purposes, is somewhat unusual.
Given a state and an expression, we can imagine "freezing" the computation state
and somehow generating a snapshot of the current state, plus the expression.

"Executing" a program then means firing up evaluation of the saved expression
from the resumed state:

Definition 3.1 (program) Let $\Gamma$ be an ε₀ state and $e_h$ be an ε₀ expression; then
we define their corresponding program as the pair
$(\Gamma[{}^{\emptyset}_{\text{futures}}], e_h) \in \Gamma \times E$. □

We intentionally disregard the background threads in $\Gamma_{\text{futures}}$, in order not to have
to deal with execution stacks or partial expression evaluation. Background threads
do not look particularly useful anyway in this context, since the main idea is simply
to use global-definition procedures to have the program self-modify into something
which contains every needed auxiliary procedure and datum, before the "main
expression" can be finally evaluated; the main expression itself will be free to use
background threads.

Of course there is no guarantee that a program, when executed, will not start
self-modifying again.
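As an informal illustration of Definition 3.1 and of the procedures assumed in §3.1, the following Python sketch models a state as three dictionaries and a program as a pair of a state without futures and a main expression. Everything here is an assumption made for illustration: the class `State`, the functions `make_program` and `execute`, and the use of a Python callable as a stand-in for a reified expression are hypothetical; the method names merely echo state:global-set! and its companions, whose actual definitions belong to §5.4.1.3.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class State:
    """Toy state: global bindings, procedure bindings and background threads."""
    globals: Dict[str, Any] = field(default_factory=dict)
    procedures: Dict[str, Tuple[Tuple[str, ...], Any]] = field(default_factory=dict)
    futures: Dict[int, Any] = field(default_factory=dict)

    # Global-definition (self-modification) procedures.
    def global_set(self, name: str, value: Any) -> None:
        self.globals[name] = value

    def procedure_set(self, name: str, formals: List[str], body: Any) -> None:
        self.procedures[name] = (tuple(formals), body)

    # Companion reflective procedures.
    def global_get(self, name: str) -> Any:
        return self.globals[name]

    def procedure_get_formals(self, name: str) -> Tuple[str, ...]:
        return self.procedures[name][0]

    def procedure_get_body(self, name: str) -> Any:
        return self.procedures[name][1]

    def global_names(self) -> List[str]:
        return sorted(self.globals)

    def procedure_names(self) -> List[str]:
        return sorted(self.procedures)


Expression = Callable[[State], Any]  # stand-in for a reified expression


def make_program(state: State, main: Expression) -> Tuple[State, Expression]:
    """Snapshot the state, dropping background threads, and pair it with main."""
    snapshot = State(dict(state.globals), dict(state.procedures), {})
    return snapshot, main


def execute(program: Tuple[State, Expression]) -> Any:
    """Evaluate the saved main expression from the resumed state."""
    state, main = program
    return main(state)
```

Nothing prevents the main expression from calling `procedure_set` again while it runs, which is exactly the situation the last sentence above warns about.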

3.2.2 Static programs 

A static program is a program which, when executed, never self-modifies:

Definition 3.2 (static program) Let $(\Gamma, e_h)$ be a program; then we say that it is
semantically-static or static if in no configuration reachable from evaluating $e_h$ in
$\Gamma$ the global environment or the procedure environment is different from the one
in $\Gamma$.

More formally, a program $(\Gamma, e_h)$ is static if for all $(S'\ V'\ \Gamma')$ such that
$(e_h, \emptyset) \;\; \ddagger \;\; \Gamma \longrightarrow_{\mathcal{E}}^{*} (S'\ V'\ \Gamma')$ we have that
$\Gamma'_{\text{global}} = \Gamma_{\text{global}}$ and $\Gamma'_{\text{procedures}} = \Gamma_{\text{procedures}}$. □

We consider re-defining a global to be self-modification; however, the user can still
define mutable variables in the style of ML and Forth in a static program, by adding
one indirection level so that a global maps to a memory cell, whose content can be
updated any number of times without affecting its identity. The value of imperative
variables would then be stored in the memory state environment (§2.4.3), which
does not affect staticity.

Interestingly, the use of reflective procedures is not problematic even for a static
program: a static program can safely read its own globals and procedures, which
are constant by definition.
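The indirection trick can be illustrated with a small Python sketch which observes one particular execution. It rests on several assumptions: the names `State`, `first_self_modification`, `increment_counter` and `redefine_counter` are hypothetical, an execution is modelled as a finite list of steps, and a global holds either a plain value or a memory address. Since staticity is a property of all reachable configurations, such a check can refute staticity on a given run but cannot establish it in general.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class State:
    globals: Dict[str, int] = field(default_factory=dict)     # name -> value or address
    procedures: Dict[str, str] = field(default_factory=dict)  # name -> opaque body
    memory: Dict[int, int] = field(default_factory=dict)      # address -> cell content


def first_self_modification(state: State,
                            steps: List[Callable[[State], None]]) -> Optional[int]:
    """Run the steps in order; return the index of the first step after which the
    global or procedure environment differs from the initial one, or None."""
    globals_before = dict(state.globals)
    procedures_before = dict(state.procedures)
    for index, step in enumerate(steps):
        step(state)
        if state.globals != globals_before or state.procedures != procedures_before:
            return index
    return None


def increment_counter(state: State) -> None:
    """Update a mutable variable through its memory cell: the global is untouched."""
    address = state.globals["counter"]
    state.memory[address] += 1


def redefine_counter(state: State) -> None:
    """Re-bind the global itself: this counts as self-modification."""
    state.globals["counter"] = 99
```

Here the global `counter` is bound to address 0, so repeated increments only change `memory` and the run stays static, whereas `redefine_counter` is detected immediately.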

Our semantic versus syntactic naming convention is important and deserves some
comments. The convention comes from standard garbage collection jargon [98], and
is used to distinguish a datum which will not be accessed in the rest of
the computation from a datum which cannot be reached by traversing pointers
from the roots. It is undecidable whether a heap object is "semantic garbage", so
garbage collection works by recycling "syntactic garbage", a conservative decidable
approximation (any piece of syntactic garbage is also semantic garbage).

Like the property of being semantic garbage, our semantic staticity property
is trivially undecidable, so it would be tempting to define a notion of "syntactic
staticity" involving the use of global-definition procedures in reachable code. Such
attempts are doomed to fail in ε₀, because of the way global-definition procedures
are defined (§5.4.1.3): in practice, at least with our current set of primitives³, it
is always possible to modify procedures or globals with ordinary memory stores,
bypassing the "high-level" procedures for program self-modification.

Syntactic staticity properties are definable in typed personalities, where dynamic 
or static checks prevent the user from writing at arbitrary memory addresses. 

In accordance with our open-ended design principles we do not consider it an
"error" for a program to be self-modifying; yet staticity remains desirable since
self-modification makes most analyses impossible, prevents many compiler optimizations
including inlining, and indeed challenges the very idea of compilation⁴.

A static program can instead be compiled and optimized in a traditional way and,
as a consequence of our design, a whole-program approach, lending itself to global
optimizations [94], feels particularly natural. Since the expressions occurring in a
fixed program are all known, it is easy to build global tables with handles as keys
(§2.1.3), to perform any kind of analysis⁵.

Particularly in an untyped context, where users are supposed to be competent,
it is reasonable to consider a certain program as semantically static when users
demand so by requesting to analyze or compile a program in a modality which takes
advantage of staticity. We stress once more how all such functionality, including
compilers, can be written in the language itself as part of a "library", and is not
specially hardwired in the system in any way⁶.

³It would be possible to bootstrap the language with a different set of primitives (§5.4.1.3),
so that state:global-set! and state:procedure-set! are primitives themselves and do not
depend on others. However such a solution would be unrealistic for a practical implementation,
where identifiers and expressions are data structures like any others.

As a slightly more subtle point, a syntactic staticity guarantee would also have to prevent
destructive modification of expressions, which is possible as well in our implementation of §5.

⁴It is possible to compile only parts of the code, as several Lisp systems do, but the interaction
between interpreted code and compiled code complicates design, also requiring invalidation and
substitution of compiled code. It is also possible to have a JIT, or more simply a compiler to be
executed at run time which translates every expression as soon as it is generated, like SBCL does
[73].

⁵A useful idea for which we cannot claim novelty: the idea of attaching user-defined data to
syntactic objects, now mostly popular because of Java "annotations"
(http://download.oracle.com/javase/1.5.0/docs/guide/language/annotations.html), was already
quite explicit in McCarthy's 1959 LISP [51].

⁶We did not implement a complete compiler yet (§5.4.5) but we have a custom bytecode virtual
machine, and the beginning of native bindings. No fundamental obstacle in implementing an ε₀
native compiler is apparent, and we plan to do it in the following months.

Since the set of procedures in a static program is fixed and so is its main expression,
it makes sense to define a notation to show a program in a "linear" (and leaner)
form. It is also convenient to speak about individual program components using the
"$\_ \in \_$" operator, without making state environments explicit.

This will be useful in §4, when we will describe a static analysis in detail.

Definition 3.3 (static program linear syntax) Let $p = (\Gamma, e_h)$ be a static
program. If
$\Gamma_{\text{procedures}} = \{f_1 \mapsto (x_{1_1} \ldots x_{1_{n_1}}, e_{h_1}),\; f_2 \mapsto (x_{2_1} \ldots x_{2_{n_2}}, e_{h_2}),\; \ldots,\; f_m \mapsto (x_{m_1} \ldots x_{m_{n_m}}, e_{h_m})\}$,
then we can write the whole program as:

"$[\text{procedure } (f_1\ x_{1_1} \ldots x_{1_{n_1}})\ e_{h_1}]$
$[\text{procedure } (f_2\ x_{2_1} \ldots x_{2_{n_2}})\ e_{h_2}]$
$\ldots$
$[\text{procedure } (f_m\ x_{m_1} \ldots x_{m_{n_m}})\ e_{h_m}]$
$e_h$".

We also write:

• "$[\text{procedure } (f\ x_1 \ldots x_n)\ e_{h'}] \in p$" to mean that
  $\Gamma_{\text{procedures}} : f \mapsto (x_1 \ldots x_n, e_{h'})$;

• "$e_h \in p$", to mean that $e_h$ is the main expression of $p$. □

3.2.3 When to run analyses 

In traditional languages the act of performing a "static" analysis means running some
procedure over the syntax trees from a compilation unit before the unit is compiled
or executed; but with our program notion above blurring phases and units, the very
idea of "static analysis" in the context of ε becomes fuzzy.

Activities closely analogous to static analysis remain meaningful: for example in 
a statically-typed personality the procedures and global variables which are part of 
a program at a given point can still be usefully checked for type safety, independently 
from the way each entity was defined in the past evaluation history. 

Some analyses may be attempted even for non-static programs. The problem
becomes rather the point in time at which to run analyses: since no "end point"
is apparent, no obvious solution comes to mind. A personality might run some
or even all the analyses right after evaluating each toplevel expression; as a more
radical hypothesis an advanced editor such as Emacs can certainly be programmed
to communicate with an interpreter after each character modification, demanding
to run analyses⁷ and visualizing their continuously updated results.

⁷The difficulty of this approach is due less to program analysis than to the difficulty of defining
the semantics of incremental modifications to non-contiguous points of a program. This seems
hard to accomplish for ε without rebuilding the entire state from scratch at every change, at a
prohibitive cost in performance; caching mechanisms can be conceived for some personalities to
make such operations more efficient.

Of course any similar solution needs to cope with the possibility of yet unresolved
forward references, which cannot be prevented in general due to the mutual
recursion inherent in ε₀ procedures: it is to be expected and regarded as normal
that analyses fail at some points where the current set of global definitions is "open".

One likely appropriate time for running analyses is right before compilation, since
no unresolved forward references would be present at that point; but in ε
compilation does not necessarily mark any "terminal" point of the evaluation history,
either. It seems reasonable to also allow analyses to be run at any point, on demand.

We will see in §5.4.1.5 how transforms may be conveniently used to automatically 
associate analysis to global definitions. 

3.3 Unexec 

Our programs, be they static or not, are in fact "system images", which it would be
convenient to write to disk for later execution, or even to transfer to different
computers; the main expression to be saved as part of the program might simply be
a call to the REPL procedure, itself calling the interpreter: that way restoring the
system image would open an interactive session in the saved state.

One could even envisage "snapshots" as a way of saving the current system state 
before performing an experimental and potentially destructive modification in an 
interactive way: if the modification fails, the user can revert to the old state by 
loading the last snapshot, presumably a much faster operation than re-building the 
previous state by repeating the same self-modifications which generated it in the 
first place from the initial state. 

The functionality cursorily described above resembles the Emacs "unexec" hack [47,
§Building Emacs]. Emacs consists of a relatively small Lisp interpreter written in C
which contains the core primitives, plus the bulk of the system implemented in Lisp;
in order to avoid loading hundreds of Lisp files at every startup, the native Emacs
executable is built so that it fires up in the state which would be produced by loading
the initial Lisp files. Doing this in C with native processes is tricky and requires
system-specific low-level code. Our intent is similar, but our implementation will be
much simpler and largely machine-independent.

The general idea of our unexecing strategy is to simply marshal data structures 
into a linear representation which can later be read back in an exec phase, based on 
unmarshalling. 

The composition of unexec and exec yields a state identical⁸ to the original one
up to buffer addresses.

⁸We are assuming that memory encodes the complete global state, but this assumption breaks
if the state refers to system structures such as open files or sockets: unexec cannot reproduce any
object outside its process address space.

We cannot claim novelty for this idea, considering for example Hoare's early
intuition in [36, §3.3(2-3)]; in a couple of recent systems unexecing exists, but plays a
less central role than in ours: SML/NJ for example supports a "heap2exec" utility
[49]; it generates native code,⁹ yet it can only run on a couple of platforms, and
heap2exec is not by any means "the compiler" for SML/NJ, but rather just one tool
among others. Unexec support has also been discussed or experimented with for
Perl, Python and Guile [21].

3.3.1 The stuff values are made of 

Since any realistic implementation must work on general-purpose Von Neumann 
machines, it is clear that the implementations of all state environments and expres- 
sions share in practice the same machine memory; and that memory holds the data 
structures we have to marshal. 

Encoding details will be made clear in §5.4.1.3; but even without specifying here 
how each kind of data is represented, we need to specify the memory model followed 
by all our in-memory objects. As a consequence of other design decisions, the actual 
data structures to be marshalled for unexecing will be surprisingly few in number 
(§5.4.2). 

We can ignore background threads, which are not involved in unexecing as they 
are excluded from programs as per Definition 3.1. The remaining values are of only 
two kinds: 

• unboxed values; 

• heap buffer pointers, also called boxed values. 

A machine word, in practice not wider than a general register (32 or 64 bits on 
modern machines), can hold either an unboxed value, or a pointer to a buffer; a 
buffer is a contiguous array of other machine words in heap memory. Pointers are 
always initial: we preferred to simply avoid interior pointers as they may exhibit 
bad interactions with some garbage collection algorithms which we may want to 
adopt in the future [98], even if ours has no such restriction (§6.3.2). 

⁹Keeping the two functionalities separate has the advantage of providing a working unexec
feature also on platforms where a native compiler is not implemented.

Unboxed values are often used for fixnums, which is to say fixed-range integers,
represented in two's complement on modern hardware. Booleans, characters and
enumerated values also fit comfortably in the range of unboxed values. No provision is
made for objects smaller than a word: at this level, one word is the smallest
representable datum. Complex data containing multiple "fields" usually need to be boxed,
but if all fields put together fit within the width of an unboxed datum, they can also
be packed into a single word: from the point of view of the memory model there
is no difference between a single-field and a multi-field unboxed object. Efficient
implementations of dynamically-typed personalities are free to reserve some bits as
tags in unboxed objects and pointers [35], in the case of pointers exploiting the fact
that allocation alignment will free at least two or three bits for all buffer addresses,
on current byte-addressed machines (§6.3).
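One possible tagging scheme of the kind alluded to above can be sketched in Python by emulating 64-bit machine words with masked integers. The scheme itself is an assumption chosen for illustration, not the one adopted by any particular personality: the lowest bit set to 1 marks a fixnum, and pointers are recognized by their three low bits being zero, as happens with 8-byte allocation alignment. All names (`encode_fixnum`, `decode_fixnum`, `is_fixnum`, `is_pointer`) are hypothetical.

```python
WORD_BITS = 64
WORD_MASK = (1 << WORD_BITS) - 1
FIXNUM_TAG = 0b1          # assumed convention: low bit 1 marks a fixnum
POINTER_LOW_BITS = 0b111  # low bits left free by 8-byte alignment


def encode_fixnum(n: int) -> int:
    """Pack a signed integer into a tagged word, using one bit for the tag."""
    limit = 1 << (WORD_BITS - 2)
    if not -limit <= n < limit:
        raise OverflowError("integer does not fit in a tagged fixnum")
    return ((n << 1) | FIXNUM_TAG) & WORD_MASK


def is_fixnum(word: int) -> bool:
    return (word & 0b1) == FIXNUM_TAG


def is_pointer(word: int) -> bool:
    """An aligned buffer address has all its low tag bits clear."""
    return (word & POINTER_LOW_BITS) == 0


def decode_fixnum(word: int) -> int:
    """Recover the signed integer: reinterpret as two's complement, drop the tag."""
    if not is_fixnum(word):
        raise ValueError("not a fixnum")
    if word >> (WORD_BITS - 1):   # sign bit set: negative in two's complement
        word -= 1 << WORD_BITS
    return word >> 1              # arithmetic shift discards the tag bit
```

The price of the tag is one bit of fixnum range; in exchange a word can be classified without following any pointer.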

Some values are in practice necessarily boxed, notably reified expressions; as a
consequence in some cases what we informally called a "value" in §2 is actually a
function of the pointer, which we use as a reference to the entire object to pass
around, plus all the memory to which it refers, closed under the "points-to" relation.

Values can then be visualized as a graph, possibly containing converging edges 
and cycles. 

It is in practice possible to alter memory to change a boxed datum component
even when such a value does not appear as "mutable" in the semantics, for example
by using a store primitive on an expression datum. Such practices would entail
primitive failure¹⁰ and hence not respect the hypotheses for implementation guarantees
(see Implementation Note 2.2); the fact that they are possible does not constitute
a violation of semantics.

The following linearized textual format for memory data structures, including
addresses, is very convenient for debugging the implementation and also as a generic
fallback "untyped printer":

Syntactic Convention 3.4 (memory dump) We dump a given datum into a
string of colored characters, according to its shape. There are two cases:

• an unboxed datum is written in green in decimal, as a two's complement signed
integer;

• a pointer is written as a hexadecimal number, prefixed by the conventional
prefix "0x";

— if the referred buffer occurs for the first time in the data structure
(depth-first left-to-right), its address is written in red followed by a dump of its
buffer elements between brackets, with consecutive content words separated
by a space;

— if the referred buffer has already occurred, its address is written in yellow
and the buffer content is not repeated. □
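The convention can be sketched as a short recursive Python function. The sketch makes several simplifying assumptions which are not part of the convention: unboxed data are modelled as Python integers and buffers as Python lists, colors are omitted, no space is written between an address and its opening bracket, and since Python does not expose stable machine addresses the function takes a hypothetical `address_of` parameter mapping a buffer to the number to print.

```python
from typing import Callable, List, Set, Union

Datum = Union[int, list]  # an unboxed integer, or a buffer modelled as a list


def dump(datum: Datum, address_of: Callable[[list], int] = id) -> str:
    """Linearize a datum depth-first, left to right, printing each buffer's
    content only at its first occurrence."""
    seen: Set[int] = set()  # identities of the buffers already dumped

    def visit(d: Datum) -> str:
        if isinstance(d, int):
            return str(d)                   # unboxed: signed decimal
        address = hex(address_of(d))        # pointer: hexadecimal with "0x"
        if id(d) in seen:
            return address                  # repeated buffer: address only
        seen.add(id(d))
        words: List[str] = [visit(word) for word in d]
        return address + "[" + " ".join(words) + "]"

    return visit(datum)
```

Marking a buffer as seen before visiting its content is what makes the function terminate on cyclic structures; very deep structures would still exceed Python's recursion limit, which an iterative variant would avoid.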

With some practice it is not difficult to make sense of quite complicated data
structures by reading memory dumps. Despite not being strictly needed to parse textual
dumps, color makes it easier for humans to recognize a structure's shape at a glance.

¹⁰We did not specify explicit rules for our chosen set of primitives, including preconditions to be
satisfied to avoid failure; however we can quickly hint at a solution: we can imagine that each buffer
contains an initial boolean tag word, recording its mutability or lack thereof; the store primitive
will only permit writing to mutable buffers, and another primitive will be available to change a
mutability tag from mutable to immutable, but never the converse. Load and store primitives
would implicitly skip the tag word in the offset they receive (§5.4.1.3).

Of course the implementation does not need to actually represent the mutability tag word:
Implementation Note 2.2 permits us to assume that no failure occurs, and still respect our
specification.

In most cases (but not all: see the discussion of hashes in §3.3.2.2) the actual nu- 
meric address held by pointers is not relevant for algorithms, and a pointer simply 
"identifies" a certain buffer, independently from its specific placement in memory. 
It is hence reasonable to represent data graphically, ignoring addresses and simply 
using arrows for pointers, multi-slot boxes for buffers, and numbers for unboxed 
data. 

Such "address invariance" is fortunate, since usually we do not have control over 
buffer addresses at allocation time^^, hence we cannot reliably re-create a buffer 
at a specified memory address. What the composition of marshalling and 
unmarshalling will accomplish, then, is the reproduction of the data structure graph 
(Figures 3.1 and 3.2). For example, after marshalling and unmarshalling, the data 
structure dumped in Figure 3.2 might be "cloned" into 0x26aaaf [0x2899220 [42] 
0x3078920 [0x2899220 0]]. 

Figure 3.1: A circular list holding the fixnums 57, 3 and -2, whose dump could be 
0x27032d0 [57 0x279ead0 [3 0x28a66e0 [-2 0x27032d0]]]. 

Figure 3.2: An example of sharing: a two-element list whose elements both point 
to the same one-element buffer, holding the fixnum 42 and using 0 as a terminator. 
One possible dump is 0x29ecd90 [0x2714220 [42] 0x29549f [0x2714220 0]]. 

3.3.2 Marshalling 

Textual dumps as in Syntactic Convention 3.4 could serve as a marshalling format; 
however our implementation marshals data structures into binary files, for efficiency 
reasons. Since specific pointer values are immaterial, in marshalling we replace 
them with sequential 0-based identifiers, which enables some minor optimizations. 
The logic of marshalling and unmarshalling algorithms resembles moving garbage 
collection algorithms such as semispace [98], which have to recursively "clone" data 
structure graphs. 

^^System libraries ultimately choose data structure addresses, providing very few guarantees. 
Sometimes the problem is made even worse by deliberate address space boundary randomizations 
performed for security reasons [74]. 

Similarly to textual dumps, when marshalling, the idea is to recursively trace a 
data structure keeping track of which buffers we have already visited; marshalling 
produces a sequence of zero or more "buffer definitions" followed by the single main 
object, be it a pointer or an unboxed value. Pointers are encoded as buffer indices, 
following the definition order. 

Conversely, the unmarshal procedure will allocate and fill buffers, and then 
resolve buffer identifiers into pointers in a second pass. 

In our current implementation each file field is a 32-bit big-endian word^^. The 
binary dump begins with a word holding the number of buffers, followed by the 
same number of "buffer definitions", and finally by the "main object"; each buffer 
definition contains a word encoding the buffer size in words, followed by as many 
"elements", each element containing two words: either a 0 tag for an unboxed object 
followed by the content, or a 1 tag for a boxed object followed by the buffer index 
(in the order of buffer definitions, 0-based); the main object is one further element. 

$$\textit{buffer-no}\ \overbrace{\big(\textit{element-no}\ \underbrace{((0|1)\ \textit{fixnum})^{*}}_{\textit{element-no}\text{ times}}\big)^{*}}^{\textit{buffer-no}\text{ times}}\ (0|1)\ \textit{fixnum}$$
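
As an illustration of the format just described, the following Python sketch marshals and unmarshals a data structure graph. It is not the implementation discussed in the text: the names `Buffer`, `marshal` and `unmarshal` are hypothetical, the tag values 0 (unboxed) and 1 (boxed) are read off the grammar above, and unboxed payloads are assumed to be signed 32-bit integers.

```python
import struct

class Buffer:
    """Hypothetical heap buffer: elements are ints (unboxed) or Buffers."""
    def __init__(self, elements=()):
        self.elements = list(elements)

def marshal(datum):
    order, index = [], {}              # definition order, id(buffer) -> index
    def visit(d):                      # recursive trace, each buffer once
        if isinstance(d, Buffer) and id(d) not in index:
            index[id(d)] = len(order)
            order.append(d)
            for e in d.elements:
                visit(e)
    def element(d):                    # two words: tag, then payload
        if isinstance(d, Buffer):
            return struct.pack(">II", 1, index[id(d)])
        return struct.pack(">Ii", 0, d)
    visit(datum)
    out = [struct.pack(">I", len(order))]
    for b in order:
        out.append(struct.pack(">I", len(b.elements)))
        out.extend(element(e) for e in b.elements)
    out.append(element(datum))         # the main object
    return b"".join(out)

def unmarshal(data):
    pos = 0
    def word(fmt=">I"):
        nonlocal pos
        (w,) = struct.unpack_from(fmt, data, pos)
        pos += 4
        return w
    def element():
        tag = word()
        payload = word(">I") if tag == 1 else word(">i")
        return tag, payload
    # First pass: read raw (tag, payload) pairs and allocate the buffers.
    raw = []
    for _ in range(word()):
        size = word()
        raw.append([element() for _ in range(size)])
    main = element()
    buffers = [Buffer() for _ in raw]
    # Second pass: resolve buffer indices into pointers.
    def resolve(e):
        tag, payload = e
        return buffers[payload] if tag == 1 else payload
    for b, r in zip(buffers, raw):
        b.elements = [resolve(e) for e in r]
    return resolve(main)
```

Because every buffer is defined exactly once and referred to by index, sharing and cycles in the original graph are reproduced in the unmarshalled copy. The recursive trace is kept for brevity; a very deep structure would call for an explicit stack.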

3.3.2.1 Boxedness tags 

Up to this point we have assumed that marshalling, and textual dumping as well, 
can discriminate between pointers and unboxed objects; but this is not possible at 
the hardware level. 

On the physical machine pointers are memory addresses, which is to say numbers, 
and as such in principle indistinguishable from unboxed objects such as fixnums. 

Modern hardware and operating systems tend to guarantee that objects will not 
be allocated at very low addresses, so in practice it may be safe to assume that all 
pointers have a numeric value larger than some constant such as 2^^ [5]; alignment 
causes all pointers to be multiples of the word size, possibly times some other small 
factor; however many large fixnums remain effectively impossible to discriminate 
from pointers. 

^^Using 32-bit rather than 64-bit words helps to avoid relying on non-portable behavior by 
mistake when dumping very large unboxed data. We could reduce tag words to bytes or even 
single bits, but particularly in the latter case it is not clear whether the denser format would 
compensate for the additional required shuffling in terms of time efficiency. Space consumption 
is not a problem: typical unexec binary dumps have sizes ranging in the order of one to a few 
megabytes. 

The solution is providing a one-bit boxedness tag associated to each datum, plus 
a dimension field per buffer. Dimensions tend not to be overly problematic in 
practice^^. 

The bit can be stored within each word itself, if reducing its payload width 
is acceptable; otherwise a much less efficient but more flexible solution consists 
in representing all objects as boxed two-word buffers, using one element as the 
boxedness tag and another for the payload — We implemented the latter strategy, 
as it was slightly easier to integrate with Guile (§5.4.1). 

In either case, both eq primitives and the memory management system should 
take boxedness tags into account: this will slow down arithmetic operations and 
possibly complicate garbage collection. On the other hand, some garbage collectors 
(for example OCaml's) already require the same tagging strategy for their own 
purposes: where such a collector is used anyway, boxedness tags cause no additional 
overhead. 
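
To make the first strategy concrete, here is a minimal Python sketch of a boxedness bit stored within each word. It is not the two-word representation actually implemented in the text; the convention chosen here (low bit 1 for fixnums, low bit 0 for word-aligned pointers, 64-bit words) is an assumption made for the example, and all function names are hypothetical.

```python
WORD_BITS = 64                      # assumed machine word width
WORD_MASK = (1 << WORD_BITS) - 1

def tag_fixnum(n):
    """Encode a fixnum as an unboxed word: payload shifted left, low bit set."""
    return ((n << 1) | 1) & WORD_MASK

def is_unboxed(word):
    """Aligned pointers are even, so the low bit discriminates the two cases."""
    return word & 1 == 1

def untag_fixnum(word):
    """Recover the signed payload, which is one bit narrower than the word."""
    payload = word >> 1
    if payload >= 1 << (WORD_BITS - 2):      # sign bit of the payload is set
        payload -= 1 << (WORD_BITS - 1)
    return payload
```

The price mentioned in the text is visible here: the payload loses one bit of width, and every arithmetic primitive has to untag its operands and retag its result.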

Since boxedness tags are expensive it is conceivable to provide two different run- 
time libraries, a tagged runtime associating a boxedness tag to every word and a 
length field to every buffer, and an untagged runtime directly using the machine 
representation: dumping, marshalling and unexecing will only be possible on the 
"tagged" runtime, but the "untagged" runtime will be more efficient. One interest- 
ing feature of this solution is that, since unmarshalling does not rely on tags, the 
untagged runtime can always be used as a last stage for a program which has been 
developed on the tagged runtime, before being unexeced for the final time. In case 
of compiled code, a (presumably static) compiled program should probably always 
use an untagged runtime. 

Boxedness tags, when present, can also be used by primitives to perform some 
dynamic checks and prevent out-of-bounds errors in a very crude form of dynamic 
"typing", which however has value when debugging. In this view it might make 
sense to provide three different runtimes: "untagged", "tagged checked" and "tagged 
unchecked". 

Our implementation currently contains only a tagged checked runtime. Imple- 
menting the other runtimes is not hard, being mostly a matter of using C prepro- 
cessor macros to wrap object accesses; we will provide the two missing runtimes as 
soon as we eliminate the dependency on Guile. 

3.3.2.2 Marshalling properties 

Since we have not formally specified marshalling and unmarshalling algorithms, here 
we simply assert their properties without proof, as guarantees to be provided by an 
implementation. 

^■^A memory system organized in the BiBOP style (§6), not necessarily a garbage collector, 
would permit not to represent them at all in the most common cases. 

Again, the strong resemblance to moving garbage collectors is not coincidental. 

We first need to specify exactly what we mean by the corresponding substructures 
of an object, "before and after" marshalling: 

Definition 3.5 (marshalling correspondence) Let $a_0$ be an object which is 
marshalled into a binary dump, itself unmarshalled into the object $b_0$. Then, by 
induction: 

• $a_0$ corresponds to $b_0$; 

• if $a$ corresponds to $b$ and both $a$ and $b$ are pointers to $n$-element buffers, then 
the $i$-th component of the buffer pointed by $a$ corresponds to the $i$-th component 
of the buffer pointed by $b$ for any $0 \le i < n$. □ 

Marshalling has to "preserve structure", which is to say it has to reproduce the orig- 
inal pointer graph mapping buffers into buffers and unboxed objects into unboxed 
objects: 

Axiom 3.6 Let $a$ correspond to $b$. Then we have that $a$ is unboxed if and only if $b$ 
is unboxed. For every $n \in \mathbb{N}$ we have that $a$ is a pointer to an $n$-element buffer if 
and only if $b$ is also a pointer to an $n$-element buffer. □ 

Axiom 3.7 Corresponding unboxed objects are equal provided that they both fit into 
a dump word payload. □ 

Corresponding pointers are not guaranteed to be equal^^, but marshalling "preserves 
equality" without introducing or removing sharing, in the following sense: 

Axiom 3.8 Let $a_1$ be a pointer corresponding to $b_1$ and $a_2$ be a pointer corresponding 
to $b_2$; then we have that $a_1 = a_2$ if and only if $b_1 = b_2$. □ 

As a consequence most operations over pointers continue to work with their 
intended semantics after unmarshalling, including checking pointer equality; but 
checking whether an address is numerically smaller or bigger than another may 
yield a different result. 

Interpreting pointers as fixnums and doing arithmetic over them, for example 
to compute a hash function, in general will yield different results before and after 
unmarshalling. But using only the unboxed elements of boxed structures yields the 
same results after marshalling, provided overflow is avoided. 
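
These properties can be observed on a small example. The sketch below uses Python's standard `copy.deepcopy` merely as a stand-in for the composition of marshalling and unmarshalling (it also clones a graph while preserving sharing); the `Buffer` class and the `content_hash` function are hypothetical, and Python object identities play the role of addresses.

```python
import copy

class Buffer:
    """Hypothetical heap buffer: elements are ints (unboxed) or Buffers."""
    def __init__(self, elements=()):
        self.elements = list(elements)

def content_hash(datum, seen=None):
    """A hash computed from unboxed elements only, never from addresses."""
    seen = set() if seen is None else seen
    if not isinstance(datum, Buffer):
        return datum
    if id(datum) in seen:
        return 0                     # buffer already visited: contributes nothing
    seen.add(id(datum))
    total = 0
    for e in datum.elements:
        total = (31 * total + content_hash(e, seen)) % 2**31
    return total

# The shared structure of Figure 3.2 and its clone.
shared = Buffer([42])
original = Buffer([shared, Buffer([shared, 0])])
clone = copy.deepcopy(original)      # stand-in for unmarshal(marshal(...))

# Sharing is neither introduced nor removed (in the spirit of Axiom 3.8).
print(clone.elements[0] is clone.elements[1].elements[0])
# Addresses differ, so an address-based hash would not be preserved...
print(id(clone) != id(original))
# ...while a hash over unboxed elements is.
print(content_hash(clone) == content_hash(original))
```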

^""We do not want to assert that they are necessarily different, because in practice garbage 
collection might intervene destroying the original object before its corresponding version is built, 
and it is conceivable that under unusual circumstances an unmarshalled object may reside at the 
same address as its corresponding original version. 






3.4 Summary 

Instead of hardwiring definition forms into the language syntax, we can keep the 
language simpler by providing procedures to update procedures and global variables. 
These procedures may be used anywhere, and allow for powerful self-modifying code. 

"Static" code, on the other hand, has the advantage of allowing analyses and 
being efficiently compilable. A state where no more self-modification takes place 
can be reached incrementally, by self-modifications. 

Having access to the current global state permits saving a snapshot of the 
system as a data structure, in a way similar to the Emacs unexec hack in terms of 
functionality, but implemented much more simply by data structure marshalling. 

Marshalling relies on boxedness tags, which can be made optional for higher 
performance. 



Chapter 4 



A static semantics for ε₀: 
dimension analysis 



The core language ε₀ as described in §2 is much simpler than other formally-specified 
languages such as SML, whose description [58, 57, 59] looks strikingly complex for 
a "small" language; Scheme Standards include non-normative semantics for some 
language subset in appendices [70, 20, 41, 79]; mainstream languages have no formal 
specification at all. 

To make a realistic argument for the practicality of our ε₀ semantics we are 
going to show an example of its application by formally describing a static analysis 
of bundle dimensions for static programs (§3.2.2), and then proving it sound with 
respect to the semantics. 

We chose to deal with bundle dimensions in this sample analysis because bundles 
are interesting as a slightly unusual feature, but of course dimension analysis has no 
privileged status: as any other static analysis in ε₀, dimension analysis can be used 
as in ML for preventing runtime errors at the cost of also rejecting some correct 
programs, or just to obtain warnings, or not at all; and of course any number of 
analyses (or "type systems") can run side by side on the same program; it is up to 
the personality implementor to decide what to do with the results. 

4.1 Dimension inference 

In analogy with Hindley-Milner type inference [22] we would like to define a 
procedure automatically assigning^ a dimension to every expression in a static program, 
where the dimension represents a conservative approximation of the size of the 
bundle the expression may evaluate to at run time. 

^An alternative approach based on checking user-supplied annotations would have been 
possible, since in practice only few expressions will have a dimension other than $\lfloor 1 \rfloor$, which could be 
assumed as the default case. Inference is however even less obtrusive, and does not seem to require 
a substantially different formalization. 

Intuitively we want to associate dimension "one" to constants such as $42_{h_1}$, and 
also to all variables such as $x_{h_2}$, since non-singleton bundles are not denotable. In the 
same spirit, a two-object bundle such as $[\text{bundle } 10_{h_1}\ [\text{primitive } +\ 1_{h_2}\ 2_{h_3}]_{h_4}]_{h_5}$ 
would have dimension "two", and of course the zero-element bundle $[\text{bundle }]_{h_1}$ 
would have dimension "zero". 

By following this line of reasoning alone, however, we get stuck very soon: for 
example, what dimension should we assign to a call to the procedure f1? 
[procedure (f1 x) [call f2 x_{h_1}]_{h_2}] 
[procedure (f2 x) x_{h_3}] 
[call f1 42_{h_4}]_{h_5} 

Of course the answer relies on f1's definition, and in particular on the dimension of 
its body. But f1's body consists of a call to f2... It is already clear that dimension 
inference has to work on an entire program, using a fixpoint construction of some 
sort: in the fashion of type inference, the analysis will deduce a set of constraints 
from a program (for example: f1 returns a result with the same dimension as 
the result of f2; f2 returns a singleton bundle; the main expression has the same 
dimension as the result of f1), and attempt to resolve them. 

4.1.1 The dimension lattice $(\mathbb{N}_\bot^\top, \sqcap, \sqcup)$ 

It is easy to see how our dimension domain needs to be at least slightly richer than 
the set of natural numbers $\mathbb{N}$, for example by looking at the main expression of the 
following program: 
[procedure (f) [call f]_{h_1}] 
[call f]_{h_2} 

Since f never returns anything the analysis cannot discover any constraint on the 
dimension of its result, other than a trivial one according to which such dimension 
is equal to itself. We call "$\bot$" the dimension of an expression on which we have no 
constraints, such as the main expression of the program above. 

As it will be made clear below, in practice only some trivially looping expressions 
have dimension $\bot$. From the dimension point of view such expressions are 
particularly unproblematic and easy to combine with others, since they can never cause 
failures thanks to ε₀'s call-by-value strategy: for example passing a parameter with 
dimension $\bot$ to any unary procedure will cause an infinite loop before the body has 
a chance of ever being evaluated, and maybe failing. 

At the opposite end of the spectrum, some expressions are clearly troublesome; 
for example a procedure call with a wrong number of parameters will definitely 
yield a dimension failure at run time, if the expression is reached and parameters 
converge; we assign the dimension "$\top$" to such trivially failing expressions. 

As a slightly more subtle case, and very similarly to Hindley-Milner type 
inference, we need to give if expressions a dimension which is the "synthesis" of its branch 
dimensions: when the then and else branches have incompatible dimensions, such 
synthesis will be $\top$. For example we assign the dimension $\top$ to the 
inconsistently-dimensioned expression $[\text{if } x_{h_1} \in \{1, 2, 3\} \text{ then } 10_{h_2} \text{ else } [\text{bundle }]_{h_3}]_{h_4}$; such an 
expression is problematic to compose, because the dimension of the result bundle 
varies according to which branch is taken at run time. 

Our dimension domain is hence made of the natural numbers $\mathbb{N}$ extended with the 
two elements $\bot$ and $\top$: we call this set $\mathbb{N}_\bot^\top$. We can easily define a partial order $\_ \sqsubseteq \_$ 
as the reflexive and transitive closure of the relation $\_ \sqsubset \_$, where 
$\_ \sqsubset \_ = \bigcup_{n \in \mathbb{N}} \{(\bot, \lfloor n \rfloor), (\lfloor n \rfloor, \top)\}$. 




Figure 4.1: The flat lattice $(\mathbb{N}_\bot^\top, \sqcup, \sqcap)$. 



The set $\mathbb{N}_\bot^\top$ with the order $\_ \sqsubseteq \_$ forms a flat lattice: for any $a, b \in \mathbb{N}_\bot^\top$, we call 
$a \sqcup b$ their least upper bound or "join", and $a \sqcap b$ their greatest lower bound or 
"meet". 

In the lattice higher values correspond to more constrained dimensions, with $\bot$ 
representing the absence of any constraint, $\lfloor n \rfloor$ with $n \in \mathbb{N}$ representing a bundle 
of exactly $n$ elements, and $\top$ expressing several conflicting constraints; the join 
operation $\_ \sqcup \_$ is but the "synthesis" mentioned above, yielding the least constrained 
dimension which is compatible with both parameters: joining $\bot$ with another 
element yields the other element, joining $\lfloor n \rfloor$ with itself yields $\lfloor n \rfloor$ for every $n \in \mathbb{N}$, and 
joining $\lfloor n \rfloor$ with $\lfloor m \rfloor$ for $n \ne m$ yields $\top$; joining $\top$ with any element yields $\top$. 

Occasionally we may also use the set $\mathbb{N}_\bot$, defined as $\mathbb{N}_\bot^\top \setminus \{\top\}$. 
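
The flat lattice is small enough to be written down directly. The following Python sketch is only an illustration: the constants `BOT` and `TOP` and the functions `leq`, `join` and `meet` are hypothetical names, and a dimension of exactly n elements is represented by the Python integer n.

```python
BOT, TOP = "bottom", "top"   # assumed encodings of the two extra elements

def leq(a, b):
    """The partial order: bottom below everything, top above everything."""
    return a == b or a == BOT or b == TOP

def join(a, b):
    """Least upper bound: the "synthesis" of two dimensions."""
    if leq(a, b):
        return b
    if leq(b, a):
        return a
    return TOP               # two different natural numbers conflict

def meet(a, b):
    """Greatest lower bound."""
    if leq(a, b):
        return a
    if leq(b, a):
        return b
    return BOT

print(join(BOT, 3), join(3, 3), join(2, 3), join(TOP, 0))
```

Since the lattice is flat, any strictly ascending chain has at most three elements, which is what makes a fixpoint computation over it terminate quickly.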
4.1.2 Definition and properties 

We are now ready to formally enunciate dimension analysis, computing a dimension 
for each expression occurring anywhere in the program, and for each procedure. 

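Definition 4.1 (Dimension) Let a program $p$ be given. We define in a 
mutually-recursive fashion: 

• The dimension function for expressions, a partial function with signature 
$E \rightharpoonup \mathbb{N}_\bot$, that we represent as the relation $\_ :\# \_ \subseteq E \times \mathbb{N}_\bot$: 

$$\frac{}{x_h :\# \lfloor 1 \rfloor} \qquad \frac{}{c_h :\# \lfloor 1 \rfloor}$$

$$\frac{e_{h_1} :\# d_1 \;\ldots\; e_{h_n} :\# d_n}{[\text{bundle } e_{h_1} \ldots e_{h_n}]_{h_0} :\# \lfloor n \rfloor} \quad d_i \sqsubseteq \lfloor 1 \rfloor \text{ for all } 1 \le i \le n$$

$$\frac{e_{h_1} :\# d_1 \quad e_{h_2} :\# d_2}{[\text{let } x_1 \ldots x_n \text{ be } e_{h_1} \text{ in } e_{h_2}]_{h_0} :\# d_2} \quad d_1 \sqsubseteq \lfloor m \rfloor \text{ for some } m \ge n,\ d_2 \sqsubset \top$$

$$\frac{\pi : n \to m \quad e_{h_1} :\# d_1 \;\ldots\; e_{h_n} :\# d_n}{[\text{primitive } \pi\ e_{h_1} \ldots e_{h_n}]_{h_0} :\# \lfloor m \rfloor} \quad d_i \sqsubseteq \lfloor 1 \rfloor \text{ for all } 1 \le i \le n$$

$$\frac{f :\# n \to d \quad e_{h_1} :\# d_1 \;\ldots\; e_{h_n} :\# d_n}{[\text{call } f\ e_{h_1} \ldots e_{h_n}]_{h_0} :\# d} \quad d_i \sqsubseteq \lfloor 1 \rfloor \text{ for all } 1 \le i \le n,\ d \sqsubset \top$$

$$\frac{e_{h_1} :\# d_1 \quad e_{h_2} :\# d_2 \quad e_{h_3} :\# d_3}{[\text{if } e_{h_1} \in \{c_1 \ldots c_n\} \text{ then } e_{h_2} \text{ else } e_{h_3}]_{h_0} :\# d} \quad d_1 \sqsubseteq \lfloor 1 \rfloor,\ d = d_2 \sqcup d_3,\ d \sqsubset \top$$

$$\frac{f :\# n \to d \quad e_{h_1} :\# d_1 \;\ldots\; e_{h_n} :\# d_n}{[\text{fork } f\ e_{h_1} \ldots e_{h_n}]_{h_0} :\# \lfloor 1 \rfloor} \quad d \sqsubseteq \lfloor 1 \rfloor,\ d_i \sqsubseteq \lfloor 1 \rfloor \text{ for all } 1 \le i \le n$$

$$\frac{e_{h_1} :\# d_1}{[\text{join } e_{h_1}]_{h_0} :\# \lfloor 1 \rfloor} \quad d_1 \sqsubseteq \lfloor 1 \rfloor$$

• the dimension function for procedures (written in relational notation) 
with signature $F \rightharpoonup (\mathbb{N} \times \mathbb{N}_\bot^\top)$, associating a procedure name with the number 
of its parameters and the dimension of its result. 

For each procedure $[\text{procedure } (f\ x_1 \ldots x_n)\ e_{h_1}] \in p$ we say $f$ has in-dimension 
$n$ and out-dimension $d$, and we write "$f :\# n \to d$" where $d$ is the minimum 
fixpoint such that $\#(e_{h_1}) = d$. 

Then we define $\#(\_) : E \to \mathbb{N}_\bot^\top$ as the total extension of $\_ :\# \_$, so that $\#(\_)$ returns 
$\top$ where $\_ :\# \_$ is not defined and its same result elsewhere. □ 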
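
The mutually recursive definition can be read as a fixpoint computation: start from "no constraint" for every procedure and re-evaluate procedure bodies until nothing changes. The Python sketch below follows that reading for a subset of the language; it is an illustration under stated assumptions rather than the analysis as implemented. Expressions are encoded as tuples, the names `dim` and `analyze` are hypothetical, the constant set of `if` is omitted because it does not affect dimensions, and primitive, fork and join are left out for brevity.

```python
BOT, TOP = "bottom", "top"   # assumed encodings; a plain int n stands for n elements

def leq(a, b):
    return a == b or a == BOT or b == TOP

def join(a, b):
    return b if leq(a, b) else a if leq(b, a) else TOP

def dim(e, procs, out):
    """Total dimension of e, given current out-dimension approximations."""
    kind = e[0]
    if kind in ("const", "var"):
        return 1
    if kind == "bundle":                      # ("bundle", item, ...)
        ds = [dim(x, procs, out) for x in e[1:]]
        return len(ds) if all(leq(d, 1) for d in ds) else TOP
    if kind == "let":                         # ("let", n, bound, body)
        _, n, bound, body = e
        d1, d2 = dim(bound, procs, out), dim(body, procs, out)
        ok = d1 == BOT or (d1 != TOP and d1 >= n)
        return d2 if ok else TOP
    if kind == "call":                        # ("call", f, actual, ...)
        f, actuals = e[1], e[2:]
        ds = [dim(x, procs, out) for x in actuals]
        formals, _ = procs[f]
        ok = len(actuals) == len(formals) and all(leq(d, 1) for d in ds)
        return out[f] if ok else TOP
    if kind == "if":                          # ("if", discriminand, then, else)
        d1, d2, d3 = (dim(x, procs, out) for x in e[1:])
        return join(d2, d3) if leq(d1, 1) else TOP
    raise ValueError(f"unsupported expression kind: {kind}")

def analyze(procs, main):
    """procs maps a name to (formals, body); returns (out-dimensions, main dimension)."""
    out = {f: BOT for f in procs}
    changed = True
    while changed:                            # ascending iteration on a finite-height lattice
        changed = False
        for f, (_, body) in procs.items():
            d = join(out[f], dim(body, procs, out))
            if d != out[f]:
                out[f], changed = d, True
    return out, dim(main, procs, out)

procs = {"f1": (["x"], ("call", "f2", ("var", "x"))),
         "f2": (["x"], ("var", "x"))}
print(analyze(procs, ("call", "f1", ("const", 42))))
```

On the f1/f2 program discussed earlier, the iteration first discovers that f2 returns a singleton, then propagates that to f1 and finally to the main expression.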

Definition 4.1 depends on the fact that the relation $\_ :\# \_$ be a function, which is 
clearly true because rule premises are pairwise disjoint. 

Notice that all the side constraints of the form "$d \sqsubset \top$" (rules for let, call 
and if) are only included for aesthetic symmetry, so that in case of any dimension 
inconsistency $\_ :\# \_$ remains undefined just as in the other syntactic cases, rather 
than returning $\top$: of course the total extension $\#(\_)$ would remain the same even if 
we erased such side constraints from $\_ :\# \_$. 

Definition 4.2 We call an expression $e_h$ plural if $\#(e_h) = \lfloor n \rfloor$ for some $n \ne 1$, 
consistently-dimensioned if $\#(e_h) \sqsubset \top$ and inconsistently-dimensioned if $\#(e_h) = \top$. 

We intentionally wrote Definition 4.2 so that it makes empty bundles plural, and 
trivially-looping expressions not plural. The reason for this choice is bound to the 
implementation: only what we call plural expressions require some non-conventional 
implementation technique such as placing a value in a number of registers or stack 
slots different from one. Expressions which never return anything do not pose 
particular problems, and we stress again that we do not consider non-termination 
an error; anyway the existence of expressions $e_h :\# \bot$ is the reason why we resist 
the temptation of defining "singular" expressions; should $e_h$ be both singular and 
plural, singular and not plural, plural and not singular, or neither? No solution 
seems intuitive, or particularly useful. 

It is not hard to see how inconsistent dimensioning "propagates outwards", from 
a contained expression out to its containing expression: 

Proposition 4.3 Any expression containing an inconsistently-dimensioned 
subexpression is inconsistently-dimensioned itself. 

Proof Assuming $\#(e_h) = \top$, we have to prove that $\#(C[e_h]) = \top$ for all contexts $C[\_]$. 
A straightforward structural induction over contexts: 

• $C[e_h] = e_h$ (base case): we trivially have that $\#(C[e_h]) = \#(e_h) = \top$; 

• $C[e_h] = [\text{let } x_1 \ldots x_n \text{ be } C'[e_h] \text{ in } e_{h_1}]_{h_0}$: since by hypothesis $\#(e_h) = \top$, by 
induction hypothesis we also have $\#(C'[e_h]) = \top \not\sqsubseteq \lfloor n \rfloor$; this makes it impossible 
to satisfy the conditions of the let rule in Definition 4.1; so the relation $\_ :\# \_$ 
is undefined on $C[e_h]$, and again by Definition 4.1 we have that $\#(C[e_h]) = \top$; 

• $C[e_h] = [\text{let } x_1 \ldots x_n \text{ be } e_{h_1} \text{ in } C'[e_h]]_{h_0}$: again we have that $\#(C'[e_h]) = \top$, 
and the let rule in Definition 4.1 cannot fire because the let body $C'[e_h]$ 
has dimension $\top$; if the rule does not fire then $\#(C[e_h]) = \top$ because $\_ :\# \_$ is 
undefined on the parameter, as in the previous case; 

• $C[e_h] = [\text{call } f\ e_{h_1} \ldots e_{h_n}\, C'[e_h]\, e_{h_{n+1}} \ldots e_{h_{n+m}}]_{h_0}$: again $\#(C'[e_h]) = \top$ by 
induction hypothesis; but then there exists a procedure actual whose dimension 
is not lower than or equal to $\lfloor 1 \rfloor$, and the call rule in Definition 4.1 cannot 
fire; $\_ :\# \_$ is undefined on $C[e_h]$, hence $\#(C[e_h]) = \top$; 

• $C[e_h] = [\text{primitive } \pi\ e_{h_1} \ldots e_{h_n}\, C'[e_h]\, e_{h_{n+1}} \ldots e_{h_{n+m}}]_{h_0}$: same reasoning as the 
call case; 

• $C[e_h] = [\text{if } C'[e_h] \in \{c_1 \ldots c_n\} \text{ then } e_{h_1} \text{ else } e_{h_2}]_{h_0}$: again $\#(C'[e_h]) = \top$ by 
induction hypothesis, which means that the dimension of the discriminand is 
not lower than or equal to $\lfloor 1 \rfloor$, the if rule in Definition 4.1 cannot fire, hence 
$\_ :\# \_$ is undefined on $C[e_h]$, and $\#(C[e_h]) = \top$; 

• $C[e_h] = [\text{if } e_{h_1} \in \{c_1 \ldots c_n\} \text{ then } C'[e_h] \text{ else } e_{h_2}]_{h_0}$: similar to the let body 
case: $\#(C'[e_h]) = \top$, which prevents the if rule in Definition 4.1 from firing; 

• $C[e_h] = [\text{if } e_{h_1} \in \{c_1 \ldots c_n\} \text{ then } e_{h_2} \text{ else } C'[e_h]]_{h_0}$: same reasoning as the 
previous case; 

• $C[e_h] = [\text{fork } f\ e_{h_1} \ldots e_{h_n}\, C'[e_h]\, e_{h_{n+1}} \ldots e_{h_{n+m}}]_{h_0}$: same reasoning as the call 
case; 

• $C[e_h] = [\text{join } C'[e_h]]_{h_0}$: same reasoning as the call case; 

• $C[e_h] = [\text{bundle } e_{h_1} \ldots e_{h_n}\, C'[e_h]\, e_{h_{n+1}} \ldots e_{h_{n+m}}]_{h_0}$: same reasoning as the call 
case. ■ 

It is also intuitive that replacing a subexpression with another whose dimension 
is lower or equal will not raise the dimension of the containing expression, which 
makes #(_) a monotonic function: 

Proposition 4.4 (#-monotonicity) Replacing a subexpression with another of 
lower or equal dimension cannot raise the dimension of the containing expression. 
More formally, for any expression context $C[\_]$, expression $e$ and expression $e'$, if 
we have that $\#(e) \sqsubseteq \#(e')$ then we also have that $\#(C[e]) \sqsubseteq \#(C[e'])$. 

Proof Another straightforward structural induction over contexts. ■ 

The following definition identifies programs where no expression of dimension T 
occurs anywhere. As the reader will have anticipated, we are going to prove the 
condition sufficient to guarantee a desirable property with respect to the dynamic 
semantics. 

Definition 4.5 (well-dimensioned) We call the static program $p$ well-dimensioned 
if both the following conditions hold: 

• for all procedure definitions $[\text{procedure } (f\ x_1 \ldots x_n)\ e_{h_1}] \in p$ such that 
$f :\# n \to d$, we have $d \sqsubset \top$; 

• for the main expression $e_{h_2} \in p$ we have that $\#(e_{h_2}) \sqsubset \top$. 

We call ill-dimensioned all programs which are not well-dimensioned. □ 

4.1.2.1 There cannot be a most precise dimension analysis 

It would be nice to be able to characterize our definition of an expression dimension 
as the "most precise", but unfortunately our definition is not the best one, and in 
fact no such definition can exist. 

In order to see why at least at an intuitive level we consider the program: 

[procedure (loop) [call loop]_{h_1}] ∈ p 

[if e {M{2)} then AA(42)^^ else [call loop ]h,]h2 e P 

It is obvious that the main expression loops, but the analysis assigns the main 
expression the dimension $\lfloor 1 \rfloor$ instead of $\bot$: hence our definition of dimension does 
not correspond to "the best possible" analysis, as it is possible to change it to account 
for more particular cases, yielding a more precise result: in fact we can always 
improve the analysis by recognizing particular patterns in programs (for example by 
simplifying statically-determined conditionals at compile time as a first refinement); 
but because of the Halting Theorem we cannot hope to cover all possible cases. 

This trivial fact prevents us from finding a result similar to the Most-General- 
Type theorem in [22]. 

4.2 Semantic soundness 

Before proving the result connecting dimension analysis with ε₀'s dynamic semantics 
we need to define some machinery. 

4.2.1 Resynthesization 

The idea of resynthesization consists in taking any reachable configuration $\chi$ and 
reconstructing from it an expression $e$ that, if evaluated at the top level in the state 
of $\chi$, would yield the same result and the same effects as $\chi$. Actually we do not 
need to specify this equivalence any further, and in fact we will not prove any result 
such as reduction-equivalence on resynthesization, since our use of it here is very 
well-delimited, due to the technical need of assigning a dimension to all reachable 
ε₀ configurations. 

We can easily view the content of any value stack $V$ in a reachable configuration as 
a list of non-holed expressions $E_V$, by remarking the intuitive role of "‡" as a bundle 
delimiter; bundles within $V$ can be represented in $E_V$ as explicit bundle 
expressions^. For the purposes of resynthesization it is also safe to ignore "|" delimiters, 
since the particular arity mismatches they were conceived to prevent (see §2.5.1) 
cannot occur in static programs, our only programs of interest in this chapter. 
More formally, we define the translation as follows: 

$E_{‡ c_1 \ldots c_n ‡ V} = [\text{bundle } c_1 \ldots c_n]_{h'} . E_{‡ V}$, with some fresh handle $h'$. 

For example $V = ‡ 1\ 2 ‡ 3 ‡$ would be transformed into 
$E_V = \langle [\text{bundle } 1_{h_0}\ 2_{h_1}]_{h'_0}, [\text{bundle } 3_{h_2}]_{h'_1} \rangle$, 
with fresh handles $h_0$, $h_1$, $h_2$, $h'_0$ and $h'_1$. 

We define resynthesization as a relation $r$, written in functional notation as $r(\_\ \_)$; 
for readability's sake we omit the comma between the two parameters of $r$, since 
both tend to be syntactically complex. 

■^Actually we would need to introduce explicit bundle expressions only for plural bundles in $V$; 
the definition given below avoids this complication at the price of producing some trivial bundle 
expressions with only one item. 

Given a stack and a list of non-holed expressions as obtained from the translation 
above, resynthesization produces a non-holed expression list: 

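Definition 4.6 (resynthesization) We define the resynthesization relation $r(\_\ \_)$ 
as follows, with the convention that all the prime-decorated handles only appearing 
on the right sides be fresh: 

• $r(\langle\rangle\ E) = E$ 

• $r((e_h, \rho).S\ \ E) = r(S\ \ e_h.E)$, for any non-holed $e_h$; 

• $r(([\text{let } x_1 \ldots x_n \text{ be } \Box \text{ in } e_{h_2}]_{h_0}, \rho).S\ \ e_a.E)$ 
$= r(([\text{let } x_1 \ldots x_n \text{ be } e_a \text{ in } e_{h_2}]_{h'_0}, \rho).S\ \ E)$ 

• $r(([\text{call } f\ \Box]_{h_0}, \rho).S\ \ e_{a_n} e_{a_{n-1}} \ldots e_{a_2} e_{a_1}.E)$ 
$= r(([\text{call } f\ e_{a_1} \ldots e_{a_n}]_{h'_0}, \rho).S\ \ E)$ 

• $r(([\text{primitive } \pi\ \Box]_{h_0}, \rho).S\ \ e_{a_n} e_{a_{n-1}} \ldots e_{a_2} e_{a_1}.E)$ 
$= r(([\text{primitive } \pi\ e_{a_1} \ldots e_{a_n}]_{h'_0}, \rho).S\ \ E)$ 

• $r(([\text{if } \Box \in \{c_1 \ldots c_n\} \text{ then } e_{h_2} \text{ else } e_{h_3}]_{h_0}, \rho).S\ \ e_a.E)$ 
$= r(([\text{if } e_a \in \{c_1 \ldots c_n\} \text{ then } e_{h_2} \text{ else } e_{h_3}]_{h'_0}, \rho).S\ \ E)$ 

• $r(([\text{bundle } \Box]_{h_0}, \rho).S\ \ e_{a_n} e_{a_{n-1}} \ldots e_{a_2} e_{a_1}.E)$ 
$= r(([\text{bundle } e_{a_1} \ldots e_{a_n}]_{h'_0}, \rho).S\ \ E)$ 

• $r(([\text{fork } f\ \Box]_{h_0}, \rho).S\ \ e_{a_n} e_{a_{n-1}} \ldots e_{a_2} e_{a_1}.E)$ 
$= r(([\text{fork } f\ e_{a_1} \ldots e_{a_n}]_{h'_0}, \rho).S\ \ E)$ 

• $r(([\text{join } \Box]_{h_0}, \rho).S\ \ e_a.E)$ 
$= r(([\text{join } e_a]_{h'_0}, \rho).S\ \ E)$ □ 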

It is obvious from Definition 4.6 that resynthesization is deterministic up to 
handle identity, and the same can be said about the value stack conversion defined 
above. Since the specific choice of handles is immaterial with respect to dimension, 
and accounting for the specific choice of handles would make resynthesization much 
harder to work with without any particular benefit, from now on we will commit a 
slight abuse of language and speak about resynthesization as a function. 

In the following we are also going to need a couple of simple properties of resynthe- 
sization: 

Lemma 4.7 ("r does not delete expressions") If $r(S\ E) = E'$ for some $S$, $E$ 
and $E'$, then all expressions occurring in $E$ also occur somewhere in $E'$. 
More formally, for all $e$, if $r(S\ \ E_1.C[e].E_2) = E'$, then there exist $E'_1$, $C'[\_]$, $E'_2$ such 
that $E' = E'_1.C'[e].E'_2$. 

Proof By induction on the number of recursive calls to $r$. ■ 

Lemma 4.8 (resynthesization shape-independence) The shape of the 
expressions contained in $E$ does not affect the result of $r(S\ E)$. 

More formally, for all expression sequences $E_1$, $E_2$, $E'_1$, $E'_2$, contexts $C[\_]$, $C'[\_]$ and 
expression $e$, if we have that $r(S\ \ E_1.C[e].E_2) = E'_1.C'[e].E'_2$ then we also have that 
$r(S\ \ E_1.C[e'].E_2) = E'_1.C'[e'].E'_2$ for any other expression $e'$. 

Proof By induction on the number of recursive calls to r. ■ 



4.2.2 Weak dimension preservation 

Resynthesization allows us to gloss over the difference between an ε₀ expression and 
any configuration reached by evaluating an ε₀ expression, so that we may speak 
about the dimension of either; so, in order to further simplify our presentation, we 
extend Definition 4.1 by assigning a dimension also to reachable configurations: a 
reachable configuration will have the dimension of its resynthesization; or, slightly 
more formally, if $\chi = (S\ V\ \Gamma)$ is a reachable configuration, then we write $\#(\chi)$ to 
mean "$\#(e)$, where $r(S\ E_V) = \langle e \rangle$". 

It is not yet clear at this point why r is always defined and always yields a 
singleton expression sequence on reachable configurations; we defer the proof to 
Corollary 4.10. 

The Weak Dimension Preservation property, below, is the central result bridging 
ε₀'s dynamic semantics to dimension analysis by showing that "evaluation preserves 
dimension"; in some circles such properties are known as "subject reductions", or 
more intuitively in French as auto-reductions. 



Figure 4.2: The Weak Dimension Preservation property (Lemma 4.9): when a 
configuration $\chi$ of dimension $d$ can reduce to another configuration $\chi'$ of dimension $d'$ 
we have that $d' \sqsubseteq d$. 



The property is "weak" in the sense that an expression is allowed to reduce to 
another expression of lower dimension: as can be seen from the proof, this may 
happen with conditionals: choosing one branch or the other entails replacing the 
expression on the top of the stack by a subexpression of it, hence by an expression 
with fewer dimension constraints. In particular, it is possible that an 
inconsistently-dimensioned expression reduces to a consistently-dimensioned one ("from $\top$ to 
$d \sqsubset \top$"); anyway the vice-versa ("from $d \sqsubset \top$ to $\top$") cannot happen, which is all we need 
for our soundness property. 

Lemma 4.9 (Weak Dimension Preservation) Let reachable configurations $\chi$, $\chi'$ 
be given, such that $\chi = (S_\chi\ V_\chi\ \Gamma_\chi) \longrightarrow_E \chi' = (S_{\chi'}\ V_{\chi'}\ \Gamma_{\chi'})$. Now, if $r(S_\chi\ E_{V_\chi}) = \langle e \rangle$ 
then there exists $e'$ such that $r(S_{\chi'}\ E_{V_{\chi'}}) = \langle e' \rangle$ and $\#(e') \sqsubseteq \#(e)$. 

Proof Induction is not needed: we just directly prove that, for all cases in which 
{Sx Vx Tx) = X^EX' = iSx' Vx> r^'), we have that #(r(5^, Ey^,)) E #(r(5^ Ey^)). 
We avoid writing handles for converted value stack expressions, since they are im- 
material anyway; in this proof we also freely abuse the notation by saying that r 
returns results which are "equal to" something else, writing "=" without explicitly 
stating that the equality is up to handle choice. 

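• [constant]

$(c_h, \rho).S\ \ddagger V\ \Gamma \to_E S\ \ddagger c \ddagger V\ \Gamma$:

$r(S_\chi\ E_{V_\chi})$ = {definition of $r$} $r(S\ c.E_{V_\chi})$ = {substitution} $r(S_{\chi'}\ E_{V_{\chi'}})$; hence
in this case we have that $\sharp(r(S_{\chi'}\ E_{V_{\chi'}})) = \sharp(r(S_\chi\ E_{V_\chi}))$;

• [variable]

$(x_h, \rho).S\ \ddagger V\ \Gamma \to_E S\ \ddagger c \ddagger V\ \Gamma$:

looking at $\chi$ we have $r(S_\chi\ E_{V_\chi})$ = {by definition of $r$} $r(S\ x.E_{V_\chi})$ =
{hypothesis} $\langle e \rangle$ = {Lemma 4.7, for some $C[\_]$} $\langle C[x] \rangle$;

looking at $\chi'$ we have $r(S_{\chi'}\ E_{V_{\chi'}})$ = {substitution} $r(S\ c.E_{V_\chi})$ = {Lemma 4.8}
$\langle C[c] \rangle$, which we call $\langle e' \rangle$; since $\sharp(c) = 1 \sqsubseteq \sharp(x) = 1$, by Proposition 4.4
we have that $\sharp(e') \sqsubseteq \sharp(e)$;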

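• [let_e]

$([\mathsf{let}\ x_1 \ldots x_n\ \mathsf{be}\ e_{h_1}\ \mathsf{in}\ e_{h_2}]_{h_0}, \rho).S\ \ddagger V\ \Gamma \to_E (e_{h_1}, \rho).([\mathsf{let}\ x_1 \ldots x_n\ \mathsf{be}\ \Box\ \mathsf{in}\ e_{h_2}]_{h_0}, \rho).S\ \ddagger V\ \Gamma$:

looking at $\chi$ we have that $r(S_\chi\ E_{V_\chi})$ = {definition of $r$}
$r(S\ [\mathsf{let}\ x_1 \ldots x_n\ \mathsf{be}\ e_{h_1}\ \mathsf{in}\ e_{h_2}]_{h_0}.E_{V_\chi})$; looking at $\chi'$ we have that $r(S_{\chi'}\ E_{V_{\chi'}})$ =
{definition of $r$} $r(([\mathsf{let}\ x_1 \ldots x_n\ \mathsf{be}\ \Box\ \mathsf{in}\ e_{h_2}]_{h_0}, \rho).S\ e_{h_1}.E_{V_\chi})$ = {definition of
$r$} $r(([\mathsf{let}\ x_1 \ldots x_n\ \mathsf{be}\ e_{h_1}\ \mathsf{in}\ e_{h_2}]_{h_0}, \rho).S\ E_{V_\chi})$ = {substitution}
$r(S\ [\mathsf{let}\ x_1 \ldots x_n\ \mathsf{be}\ e_{h_1}\ \mathsf{in}\ e_{h_2}]_{h'_0}.E_{V_\chi})$ = {substitution} $r(S_\chi\ E_{V_\chi})$ =
{hypothesis} $\langle e \rangle$: again we have that $r(S_\chi\ E_{V_\chi})$ and $r(S_{\chi'}\ E_{V_{\chi'}})$ are equal to the
same singleton sequence $\langle e \rangle = \langle e' \rangle$, hence $\sharp(e') = \sharp(e)$; this proof case is
essentially identical to the proof cases of the other expansive rules;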

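• [let_c]

$([\mathsf{let}\ x_1 \ldots x_n\ \mathsf{be}\ \Box\ \mathsf{in}\ e_{h_2}]_{h_0}, \rho).S\ \ddagger c_m c_{m-1} \ldots c_2 c_1 \ddagger V\ \Gamma \to_E (e_{h_2}, \rho[x_1 \mapsto c_1, x_2 \mapsto c_2, \ldots, x_n \mapsto c_n]).S\ \ddagger V\ \Gamma$:

looking at $\chi$ we find that $r(S_\chi\ E_{V_\chi})$ = {definition of $r$, twice}
$r(S\ [\mathsf{let}\ x_1 \ldots x_n\ \mathsf{be}\ [\mathsf{bundle}\ c_1 \ldots c_m]_{h'_1}\ \mathsf{in}\ e_{h_2}]_{h'_0}.E_{V_{\chi'}})$ = {hypothesis} $\langle e \rangle$ =
{Lemma 4.7, for some context $C[\_]$} $\langle C[[\mathsf{let}\ x_1 \ldots x_n\ \mathsf{be}\ [\mathsf{bundle}\ c_1 \ldots c_m]_{h'_1}\ \mathsf{in}\ e_{h_2}]_{h'_0}] \rangle$;
looking at $\chi'$ we have that $r(S_{\chi'}\ E_{V_{\chi'}})$ = {definition of $r$, twice} $r(S\ e_{h_2}.E_{V_{\chi'}})$ =
{Lemma 4.8} $\langle C[e_{h_2}] \rangle$, which we call $\langle e' \rangle$; since by Definition 4.1 a let
expression has the same dimension as its body, it follows that $\sharp(e') \sqsubseteq \sharp(e)$ by
Proposition 4.4;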
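• [call_e]

$([\mathsf{call}\ f\ e_{h_1} \ldots e_{h_n}]_{h_0}, \rho).S\ \ddagger V\ \Gamma \to_E (e_{h_1}, \rho) \ldots (e_{h_n}, \rho).([\mathsf{call}\ f\ \Box]_{h_0}, \emptyset).S\ \ddagger\ddagger V\ \Gamma$:

identical to the other expansive rule cases;

• [call_c]

$([\mathsf{call}\ f\ \Box]_{h_0}, \rho).S\ \ddagger c_n \ddagger c_{n-1} \ddagger \ldots \ddagger c_2 \ddagger c_1 \ddagger\ddagger V\ \Gamma \to_E (e_h, \rho[x_1 \mapsto c_1, x_2 \mapsto c_2, \ldots, x_{n-1} \mapsto c_{n-1}, x_n \mapsto c_n]).S\ \ddagger V\ \Gamma$:

by the rule side condition $f$ takes exactly $n$ parameters and has dimension
$n \to d$ for some $d$. Looking at $\chi$ we find that $r(S_\chi\ E_{V_\chi})$ = {definition of $r$}
$r(([\mathsf{call}\ f\ c_1 \ldots c_n]_{h'}, \rho).S\ E_{V_{\chi'}})$ = {hypothesis and Lemma 4.7, for some context
$C[\_]$} $\langle C[[\mathsf{call}\ f\ c_1 \ldots c_n]_{h'}] \rangle$.

Starting at $\chi'$, $r(S_{\chi'}\ E_{V_{\chi'}})$ = {substitution} $r((e_h, \rho).S\ E_{V_{\chi'}})$ = {Lemma 4.8}
$\langle C[e_h] \rangle$. But $e_h$ and $[\mathsf{call}\ f\ c_1 \ldots c_n]_{h'}$ have the same dimension $d$ by
Definition 4.1, hence by Proposition 4.4 we have that $\sharp(C[e_h]) \sqsubseteq \sharp(C[[\mathsf{call}\ f\ c_1 \ldots c_n]_{h'}])$;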

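• [primitive_e]

$([\mathsf{primitive}\ \pi\ e_{h_1} \ldots e_{h_n}]_{h_0}, \rho).S\ \ddagger V\ \Gamma \to_E (e_{h_1}, \rho) \ldots (e_{h_n}, \rho).([\mathsf{primitive}\ \pi\ \Box]_{h_0}, \emptyset).S\ \ddagger\ddagger V\ \Gamma$:

identical to the other expansive rule cases;

• [primitive_c]

$([\mathsf{primitive}\ \pi\ \Box]_{h_0}, \rho).S\ \ddagger c_n \ddagger c_{n-1} \ddagger \ldots \ddagger c_2 \ddagger c_1 \ddagger\ddagger V\ \Gamma \to_E S\ \ddagger c'_m c'_{m-1} \ldots c'_2 c'_1 \ddagger V\ \Gamma'$,
when $\Gamma_{primitives}(\pi)(c_1, \ldots, c_n, \Gamma) = \langle c'_1, \ldots, c'_m, \Gamma' \rangle$:
since the rule side condition applies, we have that $\pi :_\sharp n \to m$;
$r(S_\chi\ E_{V_\chi})$ = {definition, twice} $r(S\ [\mathsf{primitive}\ \pi\ c_1 \ldots c_n]_{h'}.E_V)$ =
{hypothesis, for some context $C[\_]$} $\langle C[[\mathsf{primitive}\ \pi\ c_1 \ldots c_n]_{h'}] \rangle$;
$r(S_{\chi'}\ E_{V_{\chi'}})$ = {substitution} $r(S\ [\mathsf{bundle}\ c'_1 \ldots c'_m]_{h'}.E_V)$ = {Lemma 4.8}
$\langle C[[\mathsf{bundle}\ c'_1 \ldots c'_m]_{h'}] \rangle$; since by Definition 4.1 $[\mathsf{bundle}\ c'_1 \ldots c'_m]_{h'}$ and
$[\mathsf{primitive}\ \pi\ c_1 \ldots c_n]_{h'}$ have the same dimension, we conclude by
Proposition 4.4;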

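• [if_e]

$([\mathsf{if}\ e_{h_1} \in \{c_1 \ldots c_n\}\ \mathsf{then}\ e_{h_2}\ \mathsf{else}\ e_{h_3}]_{h_0}, \rho).S\ \ddagger V\ \Gamma \to_E (e_{h_1}, \rho).([\mathsf{if}\ \Box \in \{c_1 \ldots c_n\}\ \mathsf{then}\ e_{h_2}\ \mathsf{else}\ e_{h_3}]_{h_0}, \rho).S\ \ddagger V\ \Gamma$:
identical to the other expansive rule cases;

$([\mathsf{if}\ \Box \in \{c_1 \ldots c_n\}\ \mathsf{then}\ e_{h_2}\ \mathsf{else}\ e_{h_3}]_{h_0}, \rho).S\ \ddagger c \ddagger V\ \Gamma \to_E (e_{h_2}, \rho).S\ \ddagger V\ \Gamma$,
when $c \in \{c_1 \ldots c_n\}$:

$r(S_\chi\ E_{V_\chi})$ = {definition, twice} $r(S\ [\mathsf{if}\ c \in \{c_1 \ldots c_n\}\ \mathsf{then}\ e_{h_2}\ \mathsf{else}\ e_{h_3}]_{h'}.E_{V_{\chi'}})$ =
{hypothesis and Lemma 4.7, for some context $C[\_]$}
$\langle C[[\mathsf{if}\ c \in \{c_1 \ldots c_n\}\ \mathsf{then}\ e_{h_2}\ \mathsf{else}\ e_{h_3}]_{h'}] \rangle$;

$r(S_{\chi'}\ E_{V_{\chi'}})$ = {definition} $r(S\ e_{h_2}.E_{V_{\chi'}})$ = {Lemma 4.8} $\langle C[e_{h_2}] \rangle$;
by Definition 4.1 we have that $\sharp(e_{h_2}) \sqsubseteq \sharp([\mathsf{if}\ c \in \{c_1 \ldots c_n\}\ \mathsf{then}\ e_{h_2}\ \mathsf{else}\ e_{h_3}]_{h'})$,
and we conclude by Proposition 4.4;

$([\mathsf{if}\ \Box \in \{c_1 \ldots c_n\}\ \mathsf{then}\ e_{h_2}\ \mathsf{else}\ e_{h_3}]_{h_0}, \rho).S\ \ddagger c \ddagger V\ \Gamma \to_E (e_{h_3}, \rho).S\ \ddagger V\ \Gamma$,
when $c \notin \{c_1 \ldots c_n\}$:

identical to the previous case;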

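• [bundle_e]

$([\mathsf{bundle}\ e_{h_1} \ldots e_{h_n}]_{h_0}, \rho).S\ \ddagger V\ \Gamma \to_E (e_{h_1}, \rho) \ldots (e_{h_n}, \rho).([\mathsf{bundle}\ \Box]_{h_0}, \emptyset).S\ \ddagger\ddagger V\ \Gamma$:

identical to the other expansive rule cases;

• [bundle_c]

$([\mathsf{bundle}\ \Box]_{h_0}, \rho).S\ \ddagger c_n \ddagger \ldots \ddagger c_2 \ddagger c_1 \ddagger\ddagger V\ \Gamma \to_E S\ \ddagger c_n \ldots c_2 c_1 \ddagger V\ \Gamma$:

$r(S_\chi\ E_{V_\chi})$ = {definition, twice} $r(S\ [\mathsf{bundle}\ c_1 \ldots c_n]_{h'}.E)$ = {substitution}
$r(S_{\chi'}\ E_{V_{\chi'}})$, and again we have that $e = e'$;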

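• [fork_e]

$([\mathsf{fork}\ f\ e_{h_1} \ldots e_{h_n}]_{h_0}, \rho).S\ \ddagger V\ \Gamma \to_E (e_{h_1}, \rho) \ldots (e_{h_n}, \rho).([\mathsf{fork}\ f\ \Box]_{h_0}, \emptyset).S\ \ddagger\ddagger V\ \Gamma$:
identical to the other expansive rule cases;

• [fork_c]

$([\mathsf{fork}\ f\ \Box]_{h_0}, \rho).S\ \ddagger c_n \ddagger c_{n-1} \ddagger \ldots \ddagger c_2 \ddagger c_1 \ddagger\ddagger V\ \Gamma \to_E S\ \ddagger \mathcal{T}(t) \ddagger V\ \Gamma'$, where $\Gamma'$ is $\Gamma$ updated with the new future $t$:

$r(S_\chi\ E_{V_\chi})$ = {definition, twice} $r(S\ [\mathsf{fork}\ f\ c_1 \ldots c_n]_{h'}.E)$ = {hypothesis, for
some context $C[\_]$} $\langle C[[\mathsf{fork}\ f\ c_1 \ldots c_n]_{h'}] \rangle$;

$r(S_{\chi'}\ E_{V_{\chi'}})$ = {definition} $r(S\ \mathcal{T}(t).E)$ = {Lemma 4.8} $\langle C[\mathcal{T}(t)] \rangle$. Since
by Definition 4.1 we have that $\sharp(\mathcal{T}(t)) = 1 \sqsubseteq \sharp([\mathsf{fork}\ f\ c_1 \ldots c_n]_{h'})$, we
conclude with Proposition 4.4;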

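• [join_e]

$([\mathsf{join}\ e_{h_1}]_{h_0}, \rho).S\ \ddagger V\ \Gamma \to_E (e_{h_1}, \rho).([\mathsf{join}\ \Box]_{h_0}, \rho).S\ \ddagger V\ \Gamma$:
identical to the other expansive rule cases;

$([\mathsf{join}\ \Box]_{h_0}, \rho).S\ \ddagger \mathcal{T}(t) \ddagger V\ \Gamma \to_E S\ \ddagger c_t \ddagger V\ \Gamma$, when $\Gamma_{futures} : t \mapsto (\langle\rangle, \ddagger c_t)$:

$r(S_\chi\ E_{V_\chi})$ = {definition, twice} $r(S\ [\mathsf{join}\ \mathcal{T}(t)]_{h'}.E)$ = {hypothesis and
Lemma 4.7, for some context $C[\_]$} $\langle C[[\mathsf{join}\ \mathcal{T}(t)]_{h'}] \rangle$;

$r(S_{\chi'}\ E_{V_{\chi'}})$ = {substitution} $r(S\ c_t.E)$ = {Lemma 4.8} $\langle C[c_t] \rangle$;

since by Definition 4.1 we have that $\sharp(c_t) = 1 \sqsubseteq \sharp([\mathsf{join}\ \mathcal{T}(t)]_{h'})$, we
conclude by Proposition 4.4;

• [||]

$S\ V\ \Gamma \to_E S\ V\ \Gamma''$ when $\Gamma_{futures} : t \mapsto (S_t, V_t)$ and $S_t\ V_t\ \Gamma \to_E S'_t\ V'_t\ \Gamma'$, with $\Gamma''$ obtained from $\Gamma'$ by updating the future $t$:

here $r(S_\chi\ E_{V_\chi}) = r(S_{\chi'}\ E_{V_{\chi'}})$, hence $e'$ exists and is equal to $e$. ■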
The following trivial consequence of Lemma 4.9 allows us to think of $r$ as always
returning a single non-holed expression, when applied to a stack and a value stack
(re-encoded as a list of non-holed expressions) from a reachable configuration:

Corollary 4.10 Reachable configurations resynthesize into exactly one expression. 

Proof The property is obvious for initial configurations, which make up the in- 
duction base; Lemma 4.9 proves the inductive case. ■ 

4.2.3 Semantic soundness properties 

In the style of the Semantic Soundness Theorem of [55, §3.7], we can now finally 
prove that "well-dimensioned programs do not go wrong": 

Theorem 4.11 (Dimension Semantic Soundness) No consistently-dimensioned
expression fails because of dimension: more formally, for all $e_h$ and $\Gamma$, if $\sharp(e_h) \sqsubset \top$
then for each $\chi$ such that $((e_h, \emptyset)\ \ddagger\ \Gamma) \to_E^* \chi$ we cannot have that $\chi$ fails because
of dimension at its next $\to_E$ step.

Proof By contradiction, let us assume that a reachable consistently-dimensioned
expression fails because of dimension: then we have that $\sharp(e_h) \sqsubset \top$ and
$((e_h, \emptyset)\ \ddagger\ \Gamma) \to_E^* \chi$, where $\chi$ fails because of dimension at its next $\to_E$ step. But
because of Lemma 4.9 each reduction starting from the initial
configuration either leaves the dimension unchanged or lowers it, hence
$\sharp(\chi) \sqsubseteq \sharp(((e_h, \emptyset)\ \ddagger\ \Gamma)) \sqsubset \top$, which means that $r(\chi)$ is also consistently-dimensioned.
We examine all possible cases where $\chi$ fails because of dimension:

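• $([\mathsf{let}\ x_1 \ldots x_n\ \mathsf{be}\ \Box\ \mathsf{in}\ e_{h_2}]_{h_0}, \rho).S\ V\ \Gamma$ fails because of dimension when the top
bundle on $V$ has fewer than $n$ elements, let us say $c'_1 \ldots c'_k$, with $k < n$: then by
Definition 4.6 applied twice and Lemma 4.7, for some context $C[\_]$, we have that
$r(\chi) = C[[\mathsf{let}\ x_1 \ldots x_n\ \mathsf{be}\ c'_1 \ldots c'_k\ \mathsf{in}\ e_{h_2}]_{h_0}]$; but then by Definition 4.1 the let
expression cannot be consistently-dimensioned, and neither can $r(\chi)$ by
Proposition 4.3: contradiction;

• $([\mathsf{call}\ f\ \Box]_{h_0}, \rho).S\ V\ \Gamma$ fails because of dimension when the top frame on the
value stack has a wrong number of $\ddagger$-separated constants: then by Definition 4.6 and
Definition 4.1 we have that $r(\chi) = \top$: contradiction;

• $([\mathsf{primitive}\ \pi\ \Box]_{h_0}, \rho).S\ V\ \Gamma$ fails because of dimension when $\pi :_\sharp n \to m$ and
$V \neq \ddagger c_n \ddagger c_{n-1} \ddagger \ldots \ddagger c_2 \ddagger c_1 \ddagger\ddagger V'$:

identical to the $[\mathsf{call}\ f\ \Box]_{h_0}$ case;

• $([\mathsf{if}\ \Box \in \{c_1 \ldots c_n\}\ \mathsf{then}\ e_{h_1}\ \mathsf{else}\ e_{h_2}]_{h_0}, \rho).S\ V\ \Gamma$ fails because of dimension
when $V \neq \ddagger c \ddagger V'$:
by Definition 4.6 and Definition 4.1 we find immediately that $r(\chi) = \top$:
contradiction;

• $([\mathsf{bundle}\ \Box]_{h_0}, \rho).S\ V\ \Gamma$ fails because of dimension when
$V \neq \ddagger c_1 \ddagger c_2 \ddagger \ldots \ddagger c_{n-1} \ddagger c_n \ddagger\ddagger V'$:

similar to the $[\mathsf{call}\ f\ \Box]_{h_0}$ case: since $\chi$ is reachable the original bundle
expression contained exactly $n$ parameters, but some of them are plural; then
by Definition 4.6 and Definition 4.1 we have $r(\chi) = \top$: contradiction;

• $([\mathsf{fork}\ f\ \Box]_{h_0}, \rho).S\ V\ \Gamma$ fails because of dimension when the top frame on the
value stack has a wrong number of $\ddagger$-separated constants: identical to the
$[\mathsf{call}\ f\ \Box]_{h_0}$ case;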



Corollary 4.12, a simple consequence of Theorem 4.11, extends the semantic 
soundness result to whole programs by providing a sufficient condition for avoiding 
dimension errors. 

Corollary 4.12 Well-dimensioned programs cannot fail because of dimension. □

Of course well-dimensioning is only a sufficient condition for the absence of di- 
mension failures: an expression containing unreachable code such as [if J\f{l)f^_^ e 
{AA(2)} then [bundle ^h4,]h2 else 5h^]fiQ may have dimension T, without ever 
failing because of dimension. 
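
The gap between the static and the dynamic view can be illustrated with a toy sketch. Definition 4.1 is not restated in this section, so the rules below are assumptions chosen to agree with the facts used in the proofs above: a constant has dimension 1, a bundle of n one-dimensional items has dimension n, and a conditional whose branches disagree has dimension top. The tuple encoding and the names `dimension`, `evaluate` and `example` are hypothetical.

```python
TOP = "TOP"  # assumed marker for "inconsistently dimensioned"

def dimension(e):
    """Statically compute the dimension of a toy expression (assumed rules)."""
    kind = e[0]
    if kind == "const":
        return 1
    if kind == "bundle":
        items = e[1]
        return len(items) if all(dimension(i) == 1 for i in items) else TOP
    if kind == "if":
        _, cond, _cases, then_branch, else_branch = e
        d_then, d_else = dimension(then_branch), dimension(else_branch)
        if dimension(cond) != 1 or d_then != d_else:
            return TOP
        return d_then
    raise ValueError("unknown form: %r" % (kind,))

def evaluate(e):
    """Evaluate a toy expression, returning its result bundle as a list."""
    kind = e[0]
    if kind == "const":
        return [e[1]]
    if kind == "bundle":
        values = []
        for item in e[1]:
            (v,) = evaluate(item)  # unpacking fails on a non-1-dimensional item
            values.append(v)
        return values
    if kind == "if":
        _, cond, cases, then_branch, else_branch = e
        (c,) = evaluate(cond)
        return evaluate(then_branch if c in cases else else_branch)
    raise ValueError("unknown form: %r" % (kind,))

# The "then" branch (an empty bundle) is unreachable, since 1 is not in {2}.
example = ("if", ("const", 1), {2}, ("bundle", []), ("const", 5))
print(dimension(example))  # the analysis reports TOP
print(evaluate(example))   # yet evaluation yields the 1-dimensional bundle [5]
```

Under these assumptions the analysis is conservative in exactly the sense of the paragraph above: it rejects the conditional as a whole, although no run ever reaches the offending branch.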

4.3 Reminder: why we accept ill-dimensioned programs 

Even when only speaking about static programs, we prefer not to restrict ourselves 
to well-dimensioned programs for reasons of philosophical coherency (§1.4), despite 
the difficulty of finding believable examples of ill-dimensioned programs that we 
would like to accept as "correct". 

We could argue that it is at least conceivable that syntactic extensions automatically
produce static but ill-dimensioned $\varepsilon_0$ programs, which could perhaps be proved not
to fail because of dimension thanks to some property enjoyed by the extension. On
the other hand the extension might actually be unsafe in the general case, and still
useful.

But independently of any extension, at the level of the core language it is
reasonable to accept any program which could possibly yield a useful output: since we
want to respect the programmers' intelligence, $\varepsilon_0$ will not constrain the
expressivity of the upper layers. We therefore want to accept ill-dimensioned programs and
run them until an error condition is reached, if ever. The compiler should generate
code which runs for as long as possible, and compilation itself should not fail because of
ill-dimensioning.

Of course a personality implementer is always free to add static checks generating
warning messages or even fatal errors at compile time, yielding a very safe, if
restrictive, language. Such languages do have a place in the world, as shown
by the experience of Ada, ML and Haskell; we still hold, however, that refusing to
proceed at any cost in a hysterical paralysis is not the most useful reaction to the
discovery that a program might, or even will, fail.



• $([\mathsf{join}\ \Box]_{h_0}, \rho).S\ V\ \Gamma$ fails because of dimension when $V \neq \ddagger c \ddagger V'$:
4.4 Summary

We have shown a static semantics of $\varepsilon_0$ which makes it possible to statically infer the
dimension of the bundles each expression in a static program may evaluate to. We have then
proved our static analysis to be sound with respect to $\varepsilon_0$'s dynamic
semantics, providing a sufficient condition which guarantees that certain failures do not
happen at run time.

Such formal work is practical and not overly complicated, thanks to the
minimalistic nature of $\varepsilon_0$.

Dimension analysis can be used for rejecting programs which do not respect the sufficient
condition; however we advocate against such a practice, in the interest of extensibility.



Chapter 5 

Syntactic extension 



The core language specified in §2 is useful but inconvenient for humans to write
directly. In this chapter we are going to specify syntactic abstraction mechanisms
allowing users to easily extend the language by adding high-level syntactic forms to
be automatically rewritten into $\varepsilon_0$.

Since the extension facility is defined in $\varepsilon_0$ itself and tightly intertwined with
the problem of expressing language syntax as a data structure, we also need to deal
with a bootstrapping problem in the process.

Despite some fundamental differences, the syntactic layer of $\varepsilon$ is strongly inspired
by Lisp and indeed adopts many conventions taken from Scheme and Common
Lisp. We now quickly review Lisp dialects, in order to establish a
coherent foundation for our critique.

5.1 Preliminaries

Lisp is a family of dynamically-typed higher-order call-by-value imperative program- 
ming languages, suitable to be used in a functional style and particularly convenient 
for symbolic processing. 

The original "LISP" language described by John McCarthy back in 1959¹ [51]
has been extended and independently re-implemented many times throughout the
years, giving birth to a wealth of dialects, the most important being the large and
complex Common Lisp [4] and the elegant, minimalistic Scheme [89, 41, 79]. All
dialects share the same core ideas.

Contrary to persistent misinformation, most modern Lisps are statically scoped
("lexically scoped" in Lisp jargon); Scheme and Common Lisp in particular have
been using static scoping since their original inception in 1975 and 1984.

¹McCarthy specified in a 1995 footnote that [51], published in April 1960, "was written in early
1959": see footnote 4 at page 16 in http://www-formal.stanford.edu/jmc/recursive.pdf.

Lisp introduced several striking innovations, most of which eventually found their
way into the mainstream, including the interactive Read-Eval-Print Loop,
conditional expressions, higher order and garbage collection. Recursion has been
supported since the very beginning, and in the 1960s the possibility of expressing a
program as a collection of recursive procedures might have felt like the most radical
feature. But what still sets Lisps apart from the other languages after fifty years
is their homoiconicity: programs are encoded using the same data structure they
manipulate, which is in fact the only existing "data type" in the language; such data
structure, the s-expression, is simple and convenient for meta-programming and for
representing symbolic information in general.

To be explicit from the beginning and to prevent misunderstandings, we
make it clear right away that $\varepsilon$'s syntax will use s-expressions but will not be homoiconic:
$\varepsilon$ is not a Lisp. Yet we find it best to illustrate our solution in an incremental way,
starting from a description and critique of Lisp and then retracing the line of thought
by which we arrived at our design.

In the following we give our definition of s-expressions and then proceed to quickly 
review the main ideas of Lisp, without exactly following any particular dialect. Our 
lexical and syntactic conventions will mostly come from Scheme, but our macro 
system will be closer to Common Lisp's. 

Our meta-linguistic conventions by contrast will be non-standard, particularly
with regard to s-expressions, in order to establish a new common framework
encompassing $\varepsilon$ as well. Experienced Lisp users interested in a comparison with the
traditional jargon are referred to the footnotes for some discussion of the rationale
for our changes.

5.2 S-expressions 

The s-expression is an inductive data structure: it can be seen as a disjoint union
containing at least several fixed atomic types and an s-cons type (pronounced
"ess-cons"); an s-cons is an ordered pair of s-expressions².

The specific collection of atoms depends on the Lisp dialect, but at least some 
types are always provided: a unique object called the empty list, fixnums, and 
symbols. Symbols are objects identified by a unique name, which can be compared 
for equality with one another. All dialects also allow procedures (zero or more s- 
expressions as parameters, one s-expression as result, possibly with side effects) as 
s-expressions. 

All practical Lisp dialects also support other atom types, including booleans and 
other numeric types; other non-atomic types such as vectors are available as well, 
despite not being required for our presentation. 

²S-conses are called "conses" in Common Lisp and "pairs" in Scheme.
It is worth stressing the disjoint-union nature of s-expressions; however in this
slightly non-standard presentation we prefer to explicitly specify an encoding for
an s-expression as a pair made of a natural type identifier and an element of the
corresponding type.

The following "open-ended" definition is slightly involved due to the nature of
s-expressions as a disjoint union whose cases, despite not being all specified, are
potentially recursive³:

Definition 5.1 (s-expression) Let $A_0, \ldots, A_{n-1}$ be an ordered collection of sets
called addend types including at least the set of fixnums, the set of symbols, the
empty list singleton, the set of conses, and some set of s-expressions-to-s-expression
procedures.

We define the set of s-expressions $S_{A_0, \ldots, A_{n-1}}$ (or simply $S$ without suffixes when
the addends are clear from the context) as $(\{0\} \times A_0) \cup (\{1\} \times A_1) \cup \ldots \cup (\{n-1\} \times A_{n-1})$.
For each addend type $A_i$ we also define:

• the injected type s-$A_i$ (pronounced like $A_i$ preceded by the syllable "ess") as
$\{i\} \times A_i$;

• the $A_i$-injection function $in_{A_i} : A_i \to S$ as $\{x \mapsto (i, x) \mid x \in A_i\}$;

• the $A_i$-ejection partial function $ej_{A_i} : S \rightharpoonup A_i$ as $\{(i, x) \mapsto x \mid x \in A_i\}$.

And the untyped ejection function $ej : S \to \bigcup_i A_i$ as $\{(i, x) \mapsto x \mid (i, x) \in S\}$. □

As per Definition 5.1 we call s-fixnums, s-symbols and the empty s-list singleton
the injections of fixnums, symbols, and the empty list singleton into s-expressions.
We call s-conses the injection of conses, which, it is worth stressing once more,
means the set of s-expression pairs, rather than any pairs. The specific nature of
s-procedures depends on the Lisp dialect, but in general we can think of them as
the injection of procedures with effects accepting zero or more s-expressions and
returning one s-expression⁴.
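
The encoding of Definition 5.1 can be sketched directly as tagged pairs. This is a minimal illustration rather than an implementation of any Lisp: the concrete type identifiers and the names `inject`, `eject_as` and `eject` are hypothetical, and only four addend types are listed.

```python
# Hypothetical type identifiers: the definition only requires one distinct
# natural number per addend type.
FIXNUM, SYMBOL, EMPTY_LIST, CONS = 0, 1, 2, 3

def inject(type_id, x):
    """The injection for addend type_id: map x to the pair (type_id, x)."""
    return (type_id, x)

def eject_as(type_id, s):
    """The typed ejection, a partial function: it is undefined (here, it
    raises TypeError) when s does not belong to the injected type."""
    i, x = s
    if i != type_id:
        raise TypeError("s-expression does not belong to addend %d" % type_id)
    return x

def eject(s):
    """The untyped ejection: drop the type identifier."""
    return s[1]

s_a = inject(SYMBOL, "a")            # the s-symbol a
s_42 = inject(FIXNUM, 42)            # the s-fixnum 42
s_empty = inject(EMPTY_LIST, ())     # the empty s-list
s_pair = inject(CONS, (s_a, s_42))   # a cons is a pair of s-expressions

print(eject_as(SYMBOL, s_a))
print(eject(s_pair) == (s_a, s_42))
```

Note that the cons addend holds pairs of already-injected s-expressions, which is what makes the disjoint union recursive.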

Up to this point we have defined s-expressions as a mathematical structure; but 
since s-expressions are used for input and output, we also need to specify their writ- 
ten notation as a reasonably formal syntax. However, to avoid making our notation 
too heavy, we will not explicitly distinguish between s-expression literals and their 
corresponding non-injected literals. 

³The s-cons is not necessarily the only recursive case. We have already hinted at the "vectors"
(s-vectors for us) supported by all practical Lisps, whose elements are other arbitrary s-expressions;
but the idea of course is to enable the user to provide more recursive addends herself.

⁴In our presentation we use the "s-" prefix for explicitly highlighting the difference between a
type and its injection into s-expressions, but such distinction is not needed in Lisp where every
object is an s-expression: s-fixnums, s-symbols and the empty s-list in Lisp are just "fixnums",
"symbols" and "the empty list". Here we speak of an s-cons as an (injected) pair of two s-expressions
rather than two arbitrary objects; since here we do not have much use for non-injected conses we
can also avoid the issue of conses of non-s-expressions.
Definition 5.2 (s-expression syntax) Comments start with a semicolon and
extend up to the end of the line; all whitespace is otherwise ignored.

• We write s-fixnums as strings of one or more digits in radix 10, preceded by
an optional sign;

• we write the empty s-list as "()", possibly with whitespace or comments between
the open and closed parenthesis;

• we accept as an s-symbol any sequence of characters not containing spaces,
dots, semicolons, quotes, backquotes, commas or parentheses which is not
well-formed as an s-fixnum or an s-expression prefix as per Syntactic
Convention 5.6⁵;

• if s₁ and s₂ are s-expressions, then we write their s-cons as "(s₁ . s₂)". □

It should be noticed that s-procedures have no syntax in Definition 5.2: this means
that they cannot be directly expressed as literal constants.

Some sample s-fixnums are 1234, 0, +12, and -42; () is the empty s-list; all of
the following are s-symbols: a, b, +, this-is-an-s-symbol, incr!, even?, pi/4, 1-.
Some s-conses are (1 . 2), (a . ()), ((a . 3) . 1), ((a . b) . (57 . d)).

Since s-conses allow s-expressions to be nested at any depth, it is convenient to 
unambiguously name specific substructures: 

Definition 5.3 (s-cons selectors) Let s₁ and s₂ be s-expressions; we say that the
s-car of (s₁ . s₂) is s₁ and the s-cdr of (s₁ . s₂) is s₂.

By definition, let s-car and s-cdr be s-cons selectors; now, if s-cPr is an s-cons
selector for some "path" P ∈ {a, d}⁺, we define the s-cons selector s-caPr of s to be
the s-car of the s-cPr of s, and the s-cons selector s-cdPr of s to be the s-cdr of the
s-cPr of s⁶. When pronounced, each "a" and "d" in s-cons selector names belongs
to a different syllable. □

So for example, the s-caddr (pronounced "ess-ca-dh-dr") of an s-expression is the left 
element of the right element of its right element, and the s-caddr of (a . (b . (c 
. (d . e)))) is c. 
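
The composition rule of Definition 5.3 can be sketched as a single function driven by the path. The representation is an assumption made for brevity: an s-cons is a Python 2-tuple, s-symbols are strings and s-fixnums are integers; the name `s_cxr` is hypothetical.

```python
def s_car(s):
    """Left element of an s-cons, represented here as a 2-tuple."""
    return s[0]

def s_cdr(s):
    """Right element of an s-cons."""
    return s[1]

def s_cxr(path, s):
    """Apply the selector s-c<path>r to s; for example path "add" is s-caddr."""
    if not path or any(step not in "ad" for step in path):
        raise ValueError("a path is a non-empty string of 'a' and 'd'")
    for step in reversed(path):  # the rightmost letter is applied first
        s = s_car(s) if step == "a" else s_cdr(s)
    return s

# (a . (b . (c . (d . e))))
sample = ("a", ("b", ("c", ("d", "e"))))
print(s_cxr("add", sample))  # the s-caddr of the sample, which is c
```

Reading the path from right to left mirrors the definition: the s-caPr of s is the s-car of the s-cPr of s, so the innermost selection happens first.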

Apart from their "s-" prefix, introduced by us to distinguish s-expressions from
addends, s-cons selector names are well-established in Lisp. They trace back their
alien-sounding names to details of the IBM 704, the machine on which McCarthy's
LISP was originally implemented [51]; we retain the names for their easy
composability, and as a homage to Lisp culture.

Since s-conses may be nested arbitrarily, they can encode linear sequences of any 
length. Such sequences are conventionally nested on the right: 

⁵This description is simplified and idealized compared to what any realistic Lisp allows. Most
dialects provide some escaping mechanism to embed any character within a symbol name, including
whitespace.

⁶Again, we prepended the "s-" prefix to the traditional cons selector names.
Definition 5.4 (s-list) We call an s-expression an s-list⁷ (pronounced "ess-list") if
it is either the empty s-list or an s-cons whose s-cdr is an s-list.

If an s-list s is empty then we say it has no elements; otherwise we call its 
elements the s-car of s followed by the elements of the s-cdr of s. □ 

The following three s-expressions are s-lists: (), (a . ()), (a . ((1 . 2) . ()));
the following three s-expressions are not s-lists: foo, (a . b), (a . (b . c)).
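
Definition 5.4 translates into a short predicate which walks the spine. As a simplifying assumption, the empty s-list is the empty Python tuple, an s-cons is a 2-tuple, s-symbols are strings and s-fixnums are integers; `is_s_list` and `elements` are hypothetical names.

```python
EMPTY = ()  # assumed representation of the empty s-list

def is_s_cons(s):
    """An s-cons is represented as a Python 2-tuple."""
    return isinstance(s, tuple) and len(s) == 2

def is_s_list(s):
    """True iff s is the empty s-list or an s-cons whose s-cdr is an s-list."""
    while is_s_cons(s):
        s = s[1]  # follow the s-cdr along the spine
    return s == EMPTY

def elements(s):
    """Return the elements of an s-list as a Python list."""
    result = []
    while is_s_cons(s):
        result.append(s[0])
        s = s[1]
    if s != EMPTY:
        raise ValueError("not an s-list")
    return result

# The six samples from the text, in the same order.
print([is_s_list(s) for s in [(), ("a", ()), ("a", ((1, 2), ()))]])
print([is_s_list(s) for s in ["foo", ("a", "b"), ("a", ("b", "c"))]])
```

The third sample shows that an element may itself be an arbitrary s-cons: only the spine has to end with the empty s-list.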

It may be worth stressing that an s-list is allowed to have other s-conses as some 
or all of its elements, which are not restricted by homogeneity or indeed any con- 
straint on their shape. 

S-cons syntax becomes clumsy to use when s-expressions are nested too deeply, 
hence the need for the following syntactic convention: 

Syntactic Convention 5.5 (compact s-expression notation) An s-cons whose 
s-cdr is either another s-cons or the empty s-list may optionally be written by both: 

• omitting the dot; 

• omitting the parentheses around the s-cdr. □ 

For example the last two sample s-lists above may also be written as (a) and (a (1 
. 2)); (a b . c) is another way of writing the last sample non-s-list above. 
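
A printer applying Syntactic Convention 5.5 wherever possible can be sketched as follows, under the same assumed representation as before (2-tuples for s-conses, the empty tuple for the empty s-list, strings and integers for atoms); the name `write` is hypothetical.

```python
def is_s_cons(s):
    """An s-cons is represented as a Python 2-tuple."""
    return isinstance(s, tuple) and len(s) == 2

def write(s):
    """Render s in compact notation: along the spine, the dot and the
    parentheses around the s-cdr are omitted whenever the s-cdr is another
    s-cons or the empty s-list."""
    if s == ():
        return "()"
    if not is_s_cons(s):
        return str(s)  # atoms: s-symbols and s-fixnums
    parts = []
    while is_s_cons(s):
        parts.append(write(s[0]))
        s = s[1]
    if s != ():  # the spine does not end with (): the last dot must stay
        parts += [".", write(s)]
    return "(" + " ".join(parts) + ")"

print(write(("a", ())))              # (a)
print(write(("a", ((1, 2), ()))))    # (a (1 . 2))
print(write(("a", ("b", "c"))))      # (a b . c)
```

The three calls reproduce the examples of the paragraph above.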

Syntactic Convention 5.5 always applies to the "spine" of s-lists, making them 
more convenient to write than alternative specular structures nested on the left. 

It is easy to convince oneself that, even with Syntactic Convention 5.5, s-expression
notation remains non-ambiguous; in particular we do not need any precedence or
associativity rule to parse an s-expression in written form, nor grouping brackets.
Far from being mere grouping devices, parentheses are a fundamental part of the
syntax, and can never be added or removed without changing the denoted data structure.

The following shorthand syntax for s-expressions will be useful later: 

Syntactic Convention 5.6 (Lisp s-expression prefixes) For any s-expression
s:

• (quote s) may optionally be written as 's;

• (quasiquote s) may optionally be written as `s;

• (unquote s) may optionally be written as ,s;

• (unquote-splicing s) may optionally be written as ,@s.

We say that "'", "`", "," and ",@" are s-expression prefixes. □

⁷What we call s-list here is traditionally known as "list" or "proper list" in Lisp. What we would
refer to here as a non-s-list s-cons is known in Lisp as a "dotted list" (since Syntactic Convention 5.5
does not apply); somewhat confusingly, Lisp "dotted lists" are not "lists".
    s-expression ::=
        atom                        { atom }
      | ( s-expression rest         { s-cons(s-expression, rest) }
      | prefix s-expression         { s-cons(lookup(prefix), s-cons(s-expression, ())) }

    rest ::=
        )                           { () }
      | . s-expression )            { s-expression }
      | s-expression rest           { s-cons(s-expression, rest) }

Figure 5.1: S-expressions can be parsed with the attributed LL(1) grammar [3,
§§4-5] above, also supporting Syntactic Conventions 5.5 and 5.6. The grammar is
simple enough to allow for a hand-coded recursive-descent parser, with no need for
generators.

Replacing the second alternative for s-expression with "| ( rest { rest }" yields an 
even simpler grammar which recognizes ( . s) as an alternate degenerate form of 
any s-expression s, as in fact several Lisp implementations do. 
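
A hand-coded recursive-descent parser following the grammar of Figure 5.1 might look like the sketch below. The representation is an assumption (2-tuples for s-conses, the empty tuple for the empty s-list, strings for s-symbols, integers for s-fixnums), the lexical rules are the simplified ones of Definition 5.2, and all function names are hypothetical.

```python
import re

# Comments and whitespace match outside the capturing group and are dropped.
TOKEN = re.compile(r";[^\n]*|\s+|(,@|[()'`,.]|[^\s()'`,.;]+)")
PREFIXES = {"'": "quote", "`": "quasiquote",
            ",": "unquote", ",@": "unquote-splicing"}

def tokenize(text):
    return [m.group(1) for m in TOKEN.finditer(text) if m.group(1)]

def parse(text):
    tokens = tokenize(text)
    s, pos = parse_expression(tokens, 0)
    if pos != len(tokens):
        raise SyntaxError("trailing input")
    return s

def parse_expression(tokens, pos):
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    token = tokens[pos]
    if token in PREFIXES:  # prefix s-expression
        s, pos = parse_expression(tokens, pos + 1)
        return (PREFIXES[token], (s, ())), pos
    if token == "(":
        if pos + 1 < len(tokens) and tokens[pos + 1] == ")":
            return (), pos + 2  # the empty s-list, which is an atom
        head, pos = parse_expression(tokens, pos + 1)
        tail, pos = parse_rest(tokens, pos)
        return (head, tail), pos
    if token in (")", "."):
        raise SyntaxError("unexpected %r" % token)
    # Any other token is an atom: an s-fixnum if it looks like one, else an s-symbol.
    atom = int(token) if re.fullmatch(r"[+-]?\d+", token) else token
    return atom, pos + 1

def parse_rest(tokens, pos):
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    if tokens[pos] == ")":  # end of an s-list
        return (), pos + 1
    if tokens[pos] == ".":  # explicit s-cdr
        s, pos = parse_expression(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise SyntaxError("expected ')'")
        return s, pos + 1
    head, pos = parse_expression(tokens, pos)
    tail, pos = parse_rest(tokens, pos)
    return (head, tail), pos

print(parse("(a b . c)"))
print(parse("'(a (1 . 2)) ; a quoted s-list"))
```

One token of lookahead suffices at every choice point, which is why no parser generator is needed.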

5.3 Lisp syntax 

Up to this point we have described the syntax of the s-expression language, without
providing any corresponding semantics other than the disjoint-union data structure;
even Syntactic Convention 5.6 simply describes a more compact way of writing down
some inductive data structures, with no meaning deeper than their shape. But of course
the entire point of studying s-expressions is encoding programming language syntax
into them; the "s-" prefix indeed stands for "symbolic" in [51], and s-expressions
make up the syntax of (a superset of) Lisp forms.

5.3.1 Lisp informal syntax 

Here we resist the temptation of formally specifying a mapping from s-expressions
to terms of a call-by-value λ-calculus with conditionals and literals; such a
definition would depend on the Lisp dialect details and would be either idealized or
overcomplicated, without adding much to comprehension in any case.

The following high-level description and the examples below will suffice to
provide an intuitive idea:

Syntactic Convention 5.7 (Lisp informal syntax) Let s₁, s₂ and s₃ be
s-expressions.

• An s-symbol represents a variable with the same name as its ejection;

• an s-fixnum or empty s-list is self-evaluating, which is to say represents itself
as a literal constant⁸;

• an s-cons whose s-car is an s-symbol in a specific set and whose s-cdr has the
right shape represents the corresponding syntactic form:

– (if s₁ s₂ s₃) represents a conditional, of which s₁ represents the
condition, s₂ the "then" branch and s₃ the "else" branch;

– (lambda s₁ . s₂) represents an anonymous procedure with the parameter
names encoded by s₁; s₂ represents the sequence of forms in the body;
if s₁ is an s-list of s-symbols, then the parameters have the same names
as its elements⁹;

– (quote s₁) represents s₁ as a literal constant;

– (quasiquote s₁) represents s₁ as a quasiquoted "mostly-literal" structure:
the result is a literal structure equal to s₁ except for substructures
of the form (unquote s) or (unquote-splicing s), which represent
ordinary non-literal expressions:

* for an (unquote s) substructure, the result of evaluating s will
replace the substructure in the quasiquoted structure;

* for an (unquote-splicing s) substructure the result of evaluating
s, which must yield an s-list, will be spliced element by element within
the containing s-list in the quasiquoted structure;

– (define s₁ s₂) when s₁ is an s-symbol represents a global definition; s₂
represents the expression to be evaluated and whose result will be named;

– an s-cons whose s-car is an s-symbol whose ejection is a macro name
represents a user-defined syntactic form;

• an s-list whose s-car is not a syntactic form name represents a procedure
application: the s-car of the cons represents the operator, and the s-cdr contains
an s-list with the zero or more operands. □

⁸In Scheme the "empty list" () is not considered a valid expression nor interpreted as a literal
constant, which forces the user to needlessly quote literal objects. We consider this an unfortunate
design mistake (within an otherwise quite beautiful construction), from which to intentionally
deviate.

⁹It is not worth the trouble to introduce variadic procedures here, but this wording permits us
to at least not arbitrarily exclude them.

So, for example:

• 57 represents the literal constant 57 (an s-fixnum);

• a represents the variable named a;

• 'a represents the literal constant a (an s-symbol);

• (a) represents the application of a procedure named "a", with zero arguments;

• ((a 43)) represents the application of a procedure named "a" with one
argument, the literal constant 43; the result, presumably another procedure, is in
its turn applied with zero arguments;
• (if (a b) c d) represents a two-way conditional expression; if the result of
the application of the procedure named "a" to the value of the variable named
"b" is true, then the result is the value of the variable "c"; otherwise the result
is the value of the variable "d";

• (+ 1 2) represents the application of a procedure named "+" to two
arguments, the literal s-fixnums 1 and 2; no special syntax is needed or available
for arithmetic operators, which are considered ordinary procedures: as a
consequence of Syntactic Convention 5.7 procedure application syntax is rigidly
prefix;

• '(+ 1 2) represents the literal constant (+ 1 2), which is an s-cons and in
particular an s-list, and also happens to be a valid Lisp expression itself;

• '(if) represents the literal constant (if), an ordinary data structure which
would not be valid as a Lisp expression;

• `(if) also represents the literal constant (if);

• `(a ,b c) represents an s-list of three elements: the s-symbol a, the value of
the variable b, and the s-symbol c;

• `(a ,@b c) represents an s-list of two or more elements: the literal s-symbol
a, all the elements of the s-list which is the value of the variable b (assumed
to be an s-list), and finally the literal s-symbol c;

• What follows is a reasonable definition of a recursive procedure: 

Lisp

    (define factorial
      (lambda (n)
        (if (= n 0)
            1
            (* n (factorial (- n 1))))))



The anonymous procedure is evaluated and then globally named "factorial": 
the procedure has one parameter called "n", and its body is a simple condi- 
tional: if the result of calling the procedure "=" with the parameters n and zero 
is true, then the result is one; otherwise the result is the result of calling "*" 
with two parameters: n, and the result of calling factorial with the result 
of calling "-" with n and one. 
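
The behaviour of quasiquoted structures in the examples above can be sketched as a recursive walk over the template. This is a deliberately restricted illustration: unquoted expressions are assumed to be plain variables looked up in a dictionary, nested quasiquotation is ignored, and the representation (2-tuples for s-conses, the empty tuple for the empty s-list) and all names are hypothetical.

```python
def is_cons(s):
    """An s-cons is represented as a Python 2-tuple."""
    return isinstance(s, tuple) and len(s) == 2

def tagged(s, name):
    """True iff s has the shape (name x) for some s-expression x."""
    return is_cons(s) and s[0] == name and is_cons(s[1]) and s[1][1] == ()

def append(xs, ys):
    """Return the s-list with the elements of xs followed by those of ys."""
    return ys if xs == () else (xs[0], append(xs[1], ys))

def quasiquote(template, env):
    """Build the structure denoted by a quasiquoted template."""
    if not is_cons(template):
        return template  # atoms are literal
    if tagged(template, "unquote"):
        return env[template[1][0]]  # simplification: variables only
    head, tail = template
    if tagged(head, "unquote-splicing"):
        # Splice the elements of the variable's value into the enclosing s-list.
        return append(env[head[1][0]], quasiquote(tail, env))
    return (quasiquote(head, env), quasiquote(tail, env))

# `(a ,b c) where b is bound to 7
t1 = ("a", (("unquote", ("b", ())), ("c", ())))
print(quasiquote(t1, {"b": 7}))

# `(a ,@b c) where b is bound to the s-list (1 2)
t2 = ("a", (("unquote-splicing", ("b", ())), ("c", ())))
print(quasiquote(t2, {"b": (1, (2, ()))}))
```

The first call yields the three-element s-list (a 7 c); the second yields (a 1 2 c), whose length depends on the value of b.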

Of course a small set of predefined procedures must be provided if we want to 
perform arbitrary computation on s-expression data: in particular we will need to 
check whether a given s-expression belongs to an addend type (for example, the 
symbol? procedure returns a true s-expression iff its parameter is an s-symbol), 
plus constructors and selectors (for example, cons returns a new s-cons containing 
its two parameters; car returns the s-car of its parameter, which must be an s-cons); 
we also need a procedure eq? to check whether two given s-symbols are equal. 

Given such predefined procedures, it becomes conceptually easy to work on symbolic 
information, including language transformers and interpreters. [51] contained 
the first Lisp interpreter written in itself as an ordinary procedure, in the space of a 
couple of pages of code. 

All realistic Lisps also include some macro facility, usually Turing-complete: macros 
allow the user to define an s-expression-to-s-expression mapping for rewriting a syn- 
tactic form into a combination of already available forms; a macro may be thought 
of as a Lisp procedure to be automatically applied to all instances of a user-defined 
form, in some phase prior to execution. 

As a simple but not unrealistic example, since global procedure definitions and 
tests for zero are presumably very common, a user might prefer to be able to write 
the factorial definition above in a more compact way, as: 

Lisp 

    (define-procedure (factorial n)
      (if-zero n
               1
               (* n (factorial (- n 1)))))



User-defined forms still follow Lisp syntactic conventions^10: each use of the new 
forms define-procedure and if-zero is encoded as an s-cons whose s-car is the 
s-symbol uniquely identifying them. 

Macros are a form of syntactic abstraction (§1.3.1) which allows recurring code 
patterns to be factored out; it should be obvious that procedural abstraction alone, 
as provided by lambda and define, does not suffice to express define-procedure and 
if-zero, since their s-expression subcomponents are not necessarily valid to be 
interpreted as expressions, and in any case they do not follow the call-by-value 
evaluation strategy of procedures. 

As builders of syntax from other pieces of syntax, Lisp macros are a prime example 
of symbolic computation, and a particularly good use case for quasiquoting. 

For example, assuming the three parameters of the macro if-zero above to 
be bound to the formals discriminand, then-branch and else-branch, the macro 
body might be as simple as `(if (= ,discriminand 0) ,then-branch ,else-branch). 
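
The rewriting performed by if-zero can be mimicked outside Lisp. What follows is only a minimal Python sketch, under the assumption that s-expressions are modelled as nested Python lists and s-symbols as strings; the names expand_if_zero and macroexpand are ours and purely illustrative.

```python
# S-expressions are modelled as nested Python lists; s-symbols as strings.

def expand_if_zero(discriminand, then_branch, else_branch):
    """Macro body: build (if (= <discriminand> 0) <then> <else>)."""
    # Plays the role of the quasiquoted template
    # `(if (= ,discriminand 0) ,then-branch ,else-branch)
    return ["if", ["=", discriminand, 0], then_branch, else_branch]

def macroexpand(sexp):
    """Rewrite every (if-zero a b c) form, nested occurrences included."""
    if not isinstance(sexp, list):
        return sexp                      # atoms are left alone
    expanded = [macroexpand(item) for item in sexp]
    if len(expanded) == 4 and expanded[0] == "if-zero":
        return expand_if_zero(*expanded[1:])
    return expanded

source = ["if-zero", "n", 1, ["*", "n", ["factorial", ["-", "n", 1]]]]
result = macroexpand(source)
```

The sketch is deliberately naive: it also rewrites inside what Lisp would treat as quoted data, which a realistic macroexpander must not do.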

5.3.2 Critique 

The peculiar syntax of Lisp has always been a polarizing issue for users, either loved 
or despised with a violent fervor. Without trying to pass our personal opinions on 
the matter as science, we simply emphasize how powerful macro systems of the kind 
hinted at above are made possible by s-expressions and homoiconicity. 



^10 Of course, because of their different role, Common Lisp "reader macros" [4, §2.2], a form of 
extension for the s-expression parser, do not fit our classification; Common Lisp "macros" do. 

Syntax aside, some circles also perceive as a problem the apparent lack of efficiency 
and the strongly dynamic nature of the language, including the glaring 
absence of static checks. 

As controversial topics do, Lisp has generated valid criticism and also plenty of 
noise with popular slogans, myths and half-truths. 

• Lisp has always been used for symbolic processing, its very name standing for 
"List processing"; many users consider it inherently inefficient out of the field 
of symbolic computation, because of its very high level. 

Of course Lisp is far from limited to "lists" (s-lists for us); in fact s-lists are 
but an s-expression subset, useful in practice but not any more "primitive" 
than others. More importantly, all practical dialects have also included addend 
types such as random-access vectors and strings for decades; we avoided 
them in our presentation of s-expressions simply because such addends are not 
needed for encoding syntax, and this lack of a homoiconic role may actually 
contribute to making them less visible. 

Lisp can be compiled with reasonable efficiency, but some overhead due to its 
strongly dynamic nature is indeed hard to overcome. 

• In particular, Lisp is dynamically typed at its core: there is only one data type, 
the s-expression. Apart from some runtime tagging and checking cost, the 
main perceived problem is the difficulty of proving any useful static properties 
on realistic programs. It is not clear whether the language can be made safer 
without seriously compromising its expressivity. 

We consider this criticism to be valid. 

• Popular claims according to which "Lisp programs are abstract syntax trees" 
or "Lisp has no syntax" (intended as a positive, negative or neutral remark 
according to the speaker) can be taken as poetic exaggerations at best. 

Equating valid expressions to ASTs is an oversimplification: in fact most s- 
expressions do not map into valid expressions, and the difference between 
s-expressions and abstract syntax is relevant in practice. The slogan would be 
slightly more believable if syntax were encoded as an ML-style sum-of-products 
type, with its rigid constraints on arity and typing — but that would come 
with a high cost in extensibility. 

Lisp syntax looks uniform when compared to traditional solutions, but it is not 
nearly as regular as it could be; for example the two atomic s-expressions 1 and 
a are interpreted in radically different ways, the first as a literal s-fixnum and 
the second as a variable. A literal s-symbol needs to be quoted as in 'a, while 
a literal s-fixnum may be indifferently quoted (once) or not: the s-expressions 
1 and '1 are mapped into the exact same expression. Procedure application 
syntax is also problematic: an s-expression such as (a b c) is regarded as a 
procedure call only "as a fallback case", when the s-symbol a does not happen 
to be the name of some syntactic form. 

Could Lisp syntax be made more regular? Of course yes: as an alternative we 
could require form names as explicit s-symbols in the first position of s-lists 
also for variables and calls, and require quoting for all literals. Then instead 
of (* n (f (- n 1))) we would have something like "(call (variable *) 
(variable n) (call (variable f) (call (variable -) (variable n) 
'1)))", more uniform but hardly more convenient. Notation would remain 
clumsy even after introducing new s-expression prefix syntax for "variable" 
and "call" in the style of Syntactic Convention 5.6: for example, the very 
cluttered s-expression "@($* $n @($f @($- $n '1)))" is a representation of 
the expression above, assuming an s-expression syntax amendment disallowing 
"$" characters in symbol names; without the syntax change we would need 
more whitespace, as in "@($ * $ n @($ f @($ - $ n '1)))". 

Lisp syntax is a compromise and a consequence of conscious design decisions 
rather than historical accidents, and these issues have been known for decades: 
[87, "{FUNCALL is a pain}", pp. 26-27] already deals with the problem of using 
"lists" both for procedure calls and for other forms. 

We have to recognize that Lisp notation in practice is useful and justified as 
it stands, despite its relative asymmetry. 
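
The "more regular" notation considered above is easy to generate mechanically, which makes its verbosity tangible. The following Python sketch assumes s-expressions modelled as nested lists, s-symbols as strings and s-fixnums as integers; make_explicit and show are hypothetical helper names, and only variables, literals and procedure calls are handled (no syntactic forms such as if).

```python
def make_explicit(sexp):
    """Rewrite a conventional Lisp expression into the fully tagged notation."""
    if isinstance(sexp, str):
        return ["variable", sexp]            # a bare symbol denotes a variable
    if isinstance(sexp, int):
        return ["quote", sexp]               # every literal must be quoted
    operator, *operands = sexp               # anything else is taken as a call
    return (["call", make_explicit(operator)]
            + [make_explicit(operand) for operand in operands])

def show(sexp):
    """Print an s-expression, abbreviating (quote x) as 'x."""
    if isinstance(sexp, list):
        if len(sexp) == 2 and sexp[0] == "quote":
            return "'" + show(sexp[1])
        return "(" + " ".join(show(item) for item in sexp) + ")"
    return str(sexp)

explicit = show(make_explicit(["*", "n", ["f", ["-", "n", 1]]]))
```

Under these assumptions explicit is the same string as the tagged example in the text, roughly four times longer than the conventional (* n (f (- n 1))).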

5.4 Syntactic extensions: the ε₁ personality 

In the following we are going to build upon the experience of Lisp and address all 
three points in §5.3.2, so that: 

• ε be efficiently implementable, and not especially tied to symbolic processing; 

• personalities remain open to any typing policy: strong, weak, static, dynamic, 
hybrid, or none at all; 

• ε syntax be at least as convenient as Lisp's while remaining simple to describe 
and extend. 

For extensibility's sake, we use s-expressions to encode language syntax, as Lisp 
dialects do; but differently from Lisp we choose to decouple syntax and generic 
data structures, so that s-expressions are available as objects to compute with, just as 
one data type among a wealth of others. In practice data of each addend type 
are available either injected into s-expressions (for example, s-fixnums, s-symbols) 
or untagged (fixnums, symbols): thus s-expressions become a way of selectively 
employing dynamic typing in a world where untyped objects are also available, with 
injection and ejection operators to provide a link between the two representations. 
S-expressions are always used to represent syntax before macroexpansion, but a user 
is free to employ them at run time as well if she chooses to, where dynamic typing 
feels more convenient. For generality's sake, we want s-expressions to be extensible 
so that the user may provide more addends. 
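
The relation between untagged objects and their injections can be pictured with a small Python model. This is only a sketch under our own assumptions: the class SExpression and the functions inject and eject are illustrative names, not the operators of the actual implementation, and Python objects stand in for untagged machine words.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SExpression:
    tag: str         # the addend the payload belongs to, e.g. "fixnum"
    payload: object  # the untagged object itself

def inject(tag, untagged):
    """Wrap an untagged object into an s-expression of the given addend."""
    return SExpression(tag, untagged)

def eject(tag, sexp):
    """Recover the untagged object, checking the tag dynamically."""
    if sexp.tag != tag:
        raise TypeError(f"expected an s-{tag}, got an s-{sexp.tag}")
    return sexp.payload

s_fixnum = inject("fixnum", 42)        # dynamically typed: carries its tag
fixnum = eject("fixnum", s_fixnum)     # untagged again: just 42
```

Since the tag is an open-ended string rather than a member of a closed enumeration, new addends can be introduced without touching existing code, which is the kind of extensibility asked for above.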

Expressions are just one addend type, distinct from s-expressions; ε₀ expressions 
may be built and analyzed with constructor and selector operators, injected 
into and ejected from s-expressions. Said even more explicitly, in our solution 
s-expressions are distinct from injected expressions, and macros act like procedures 
turning s-expressions into untagged expressions. Moving farther from Lisp we 
will also define transforms (§5.4.1.5), as a way of systematically turning (possibly 
extended) untagged expressions into other untagged expressions. 

The personality stack 

The language roughly outlined above constitutes a personality we call ε₁: 

• The ε₁ personality corresponds to ε₀ augmented with forms to define globals, 
procedures, macros and transforms, plus some utility library. 

• Thanks to macros and transforms ε₁ is suitable to further extend into higher-level 
personalities. 

• We call ε the whole system, including ε₀, ε₁ and other (at this point still 
hypothetical) higher-level personalities built on top of ε₁. 

Higher-level personalities will contain macro and transform definitions in the style 
of the ones of §5.4.4, later in this chapter. 

As a language, ε₁ has an abstraction level between ε₀ and Lisp, closer to the former. 
Not necessarily aimed at the final user, ε₁ has a low-level feel and is by design 
unsafe and unforgiving: operators can be applied to the wrong operands with no 
type checking at all, and pointers are explicit. It lends itself to efficient execution, 
and is portable if used correctly. ε₁ is compatible with garbage collection but does 
not require it: the residual program resulting when all syntactic abstractions are 
transformed away might very well use manual memory management only. 

The implementation language of ε₁ is ε₀, taking advantage of Scheme for bootstrapping 
only. The implementation forces us to commit to some decisions which we 
had left open in the description of ε₀ in §2, such as the actual definition of names 
and handles in terms of data structures. All of this has a bearing on ε₁, and in our 
solution the implementations of ε₀ and ε₁ are intimately bound: an implementation 
of ε₀ alone directly parsing the syntax of Definition 2.1, despite being certainly 
possible, in practice would be little more than an idle exercise without the syntactic 
extension mechanisms of ε₁. 

5.4.1 Definition via bootstrapping 

One central idea of ε is to keep the core language as simple as possible, and have 
more complex linguistic features defined as code. As a consequence of this strategy, 
a formal specification of ε₀ automatically constitutes a formal specification of ε₁ 
as well^11, if we take into account the source code to bootstrap it from ε₀. Our 
implementation thus also serves as a specification of ε₁: code, rather than much less 
flexible mathematics. 

The bootstrapping process is nontrivial and, relying as it does on alternative 
implementations of the same data types, macros, side effects on a global state and 
unexec, it provides a particularly poor fit for the graphical notation of T diagrams 
[52], [39, §3]; here we will resort to plain English to describe the bootstrap phases, 
and present the source code following by necessity a bottom-up style. 

The general plan, developed in greater detail throughout the rest of this section, 
consists of four phases: 

(i) extend Scheme by adding untyped data (§5.4.1.1); 

(ii) implement ε₀ with s-expression syntax plus definition forms using Scheme 
macros (§5.4.1.2); 

(iii) in this temporary ε₀ implementation, build the core data structures we need 
upon untyped data, a self-interpreter relying on reflective global structures, 
macros and transforms (§5.4.1.3); 

(iv) fill reflective global structures by re-interpreting the core definitions above, so 
that the interpreter becomes usable (§5.4.1.7). 

As should be clear by now, developing ε₁ from ε₀ up to the point where we can 
define s-expressions, macroexpansion and transforms requires a certain amount of 
code (about 2000 lines) in which we have to use ε₀ to build some machinery, much 
of which is useful as part of a generic utility library as well and hence deserves to 
be considered as "belonging" to ε₁. Part of the "library" in ε₁ exists because of this 
necessity, while most of the rest relies on syntactic abstraction and is defined after 
the fourth phase, the aim being simply to make ε₁ more convenient to use (§5.4.4). 

The fourth phase, after which the global state can be queried, also makes it 
possible to unexec away from Guile into a different runtime (§3.3.2.1). 

In the following we are going to show code snippets from the implementation, which 
is available in a public bzr repository on GNU Savannah: 
https://savannah.gnu.org/bzr/?group=epsilon. We will usually omit or condense 
comments and may change indentation for reasons of space, but we will not simplify 
the code for this presentation. 

This discussion deals with the state of the implementation as of Summer 2012. 

5.4.1.1 Phase (i): extend Scheme with untyped data 

In order to eventually free ourselves from the dependency on Scheme, we need to 
define our own data structures which are not based on the predefined version of 
s-expressions. To simplify debugging and avoid reusing Scheme features by mistake, it 
is also useful to make our data structure incompatible with predefined s-expression 
addends; and since we want to unexec in the end, our "untyped" data structure will 
actually need boxedness tags (§3.3.2.1). 

^11 Of course up to the details we did not describe, such as primitives. 

We use Guile [21] as our Scheme implementation for bootstrapping. One of 
the intended applications of Guile is as an embeddable Scheme system to make C 
applications extensible in the style of Emacs [81], and in view of this use case Guile's 
C interface was made particularly convenient and flexible; we used it to define in C 
our new "type" that we call whatever, and operations over it. Boxedness tags serve 
only for the internal Guile garbage-collection machinery, at unexec time, and for 
debugging memory dumps (§3.3.1); but data structures built with whatevers should 
be thought of as untyped most of the time, as in fact they are conceived for being 
eventually unexeced into untyped objects, dropping any tagging information. 

Whatever operations help prevent possible mistakes during the bootstrap process, 
by actually performing dynamic checks on tags, in particular to prevent 
non-whatever objects from being written into whatever buffers: whatever data structures 
must remain closed over the "points-to" relation, so that no dependency on Guile 
s-expressions can remain at unexec time, and instead whatevers refer only to other 
whatevers. 
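
The closure requirement can be illustrated with a toy model. The Python classes below are our own stand-ins (Whatever and WhateverBuffer are hypothetical names, unrelated to the actual C code): the only point shown is that a dynamic check on every store keeps host-language objects out of whatever buffers.

```python
class Whatever:
    """Stand-in for a bootstrap-time whatever object."""
    def __init__(self, payload):
        self.payload = payload

class WhateverBuffer(Whatever):
    """A buffer whose cells may only ever hold whatevers."""
    def __init__(self, size):
        super().__init__(payload=None)
        self.cells = [Whatever(0) for _ in range(size)]

    def set(self, offset, value):
        # The dynamic check: refuse anything which is not a whatever, so
        # that the structure stays closed over the points-to relation.
        if not isinstance(value, Whatever):
            raise TypeError("only whatevers may be stored in a whatever buffer")
        self.cells[offset] = value

    def get(self, offset):
        return self.cells[offset]

outer = WhateverBuffer(2)
outer.set(0, Whatever(42))          # fine: a whatever fixnum
outer.set(1, WhateverBuffer(1))     # fine: a pointer to another buffer
```

An attempt such as outer.set(0, 42) fails immediately instead of leaving a host object to be discovered at unexec time.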

Since Guile is a Scheme implementation its only data type is the s-expression, of 
which whatevers are seen as just one more addend type: in our extended Guile it is 
possible to dynamically check whether an s-expression is a whatever injection. 

The implementation of this phase, mostly in bootstrap/whatever-guile/whatever-guile.c, 
is dirty and not especially interesting in itself. We defined the whatever 
"type" in C as a Smob [21, §Defining New Types (Smobs)]. Whatevers have the 
printed syntax of §3.3.2.1, also using ANSI terminal color escape sequences to help 
recognize whatever cases at a glance. 

The same C source file also defines operations over whatevers, making them 
accessible to Scheme: there are trivial conversion operators (for example from Scheme 
(injected) fixnums or (injected) threads to whatevers and vice versa), plus what ε₀ 
sees as primitives: 

• arithmetic and bitwise-logic operators; 

• memory allocation, disposing, lookup and update; 

• very simple input/output; 

• unexecing primitives, for checking the boxedness tags and buffer sizes; 

• the single primitive state:update-globals-and-procedures!, needed for 
transforms (§5.4.1.5). 

Primitives number around 30. 

The result of this phase is guile+whatever, an extended Guile which can also 
be used interactively, supporting our whatever objects while remaining completely 
compatible with Scheme. We will not show examples of its use, because some 
counter-intuitive choices were dictated by efficiency concerns; the details of the 
guile+whatever system become irrelevant anyway after phase (ii). 

5.4.1.2 Phase (ii): implement ε₀ in extended Scheme 

Our implementation uses a variant of ε₀ in which the grammar of Definition 2.1 is 
augmented by one more production, for an indirect call form: 

$e ::= [\text{call-indirect}\ e\ e^*]_h$ 

We avoid a formal specification of semantics for call-indirect: the idea is simply 
calling a procedure whose name is computed at run time as the result of an expression; 
parameters are evaluated call-by-value left-to-right as always in ε₀: first the 
operator and then the operands. 

It is easy to convince oneself that adding call-indirect is a quite harmless 
optimization, as its effect can be easily simulated by automatically generating an 
"apply function" dispatching over one of its parameters, as in Reynolds' 
defunctionalization [72]. In fact we do that as well, as a proof of concept in 
bootstrap/scheme/core.e (§5.4.1.3). 
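
To see why call-indirect adds no expressive power, consider the following Python sketch of an "apply function" in the style of defunctionalization. It is a hand-written illustration under our own assumptions: the procedures double and successor and the name apply_by_name are hypothetical, and in the real system the dispatcher would be generated automatically from the set of defined procedures.

```python
def double(x):
    return 2 * x

def successor(x):
    return x + 1

def apply_by_name(name, *arguments):
    """First-order dispatcher: one branch per known procedure name."""
    # Only direct calls appear here; the procedure to call is selected by
    # comparing the name, which is ordinary data computed at run time.
    if name == "double":
        return double(*arguments)
    if name == "successor":
        return successor(*arguments)
    raise LookupError(f"unknown procedure {name}")

# An indirect call: the operator is the result of an expression.
chosen = "double" if 2 + 2 == 4 else "successor"
result = apply_by_name(chosen, 21)
```

The cost of the simulation is a linear chain of comparisons per call, which is why a native call-indirect remains attractive as an optimization.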

Before we can use ε₀ for implementing ε₁, we need of course a syntax for ε₀. In 
typical bootstrapping fashion, we would like to define it using the language itself, 
ε₀ (or maybe ε₁, for maintainability's sake), but no parser is available. Our 
solution is mapping an s-expression encoding of ε₀ syntax into Scheme, by using 
Guile macros^12. 

Later we will provide another cleaner frontend implementation^13 in ε₁, to break 
the bootstrap dependency on Guile; that second frontend will be backward-compatible 
with this bootstrap implementation of ε₀. 

As a consequence of this decision, it is natural for our implementation to use 
symbols for names, encompassing all the sets of variable, procedure and primitive 
names X, F, Π. 

The following definition, very simple despite its length, follows the spirit of 
Syntactic Convention 5.7 but is more rigorous due to its importance: reading ε₀ syntax 
as encoded into s-expressions is key to understanding most of the details in the 
bootstrap process. 

Since macros are not supported in this phase but the process is akin to 
macroexpansion, we name this rewriting of an s-expression into an expression non-macro 
expansion. The choice of generated fresh handles is immaterial in practice, so we 
speak of non-macro expansion as a function. 

^12 We used Guile's non-standard Common Lisp-style macros instead of the standard R5RS 
hygienic macros [41] which Guile also provides. This has no particularly deep reason except 
aesthetic consistency with ε₁'s macro system; an implementation based on hygienic macros 
would have worked just as well. 

^13 This is not implemented yet: see §5.4.5. 

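Definition 5.8 (non-macro expansion) Let $s, s_1, s_2, s_3, s_4$ be s-expressions. Then 
we define, up to the choice of a fresh $h' \in \mathbb{H}$, the non-macro expansion function 
$E_{E}\llparenthesis\cdot\rrparenthesis : \mathbb{S} \to \mathbb{E}$ as: 

• $E_{E}\llparenthesis(\texttt{e0:variable}\ s)\rrparenthesis = x_{h'}$ where $x = ej_{symbol}(s)$; 

• $E_{E}\llparenthesis(\texttt{e0:value}\ s)\rrparenthesis = c_{h'}$ where $c = ej(s)$; 

• $E_{E}\llparenthesis(\texttt{e0:let}\ s_1\ s_2\ s_3)\rrparenthesis = [\text{let}\ x_1 \ldots x_n\ \text{be}\ e_{h_1}\ \text{in}\ e_{h_2}]_{h'}$ where 
$\langle x_1 \ldots x_n \rangle = E_{Xs}\llparenthesis s_1\rrparenthesis$, $e_{h_1} = E_{E}\llparenthesis s_2\rrparenthesis$ and $e_{h_2} = E_{E}\llparenthesis s_3\rrparenthesis$; 

• $E_{E}\llparenthesis(\texttt{e0:call}\ s_1\ .\ s_2)\rrparenthesis = [\text{call}\ f\ e_{h_1} \ldots e_{h_n}]_{h'}$ where $f = ej_{symbol}(s_1)$ and 
$\langle e_{h_1} \ldots e_{h_n} \rangle = E_{Es}\llparenthesis s_2\rrparenthesis$; 

• $E_{E}\llparenthesis(\texttt{e0:call-indirect}\ s_1\ .\ s_2)\rrparenthesis = [\text{call-indirect}\ e_{h_0}\ e_{h_1} \ldots e_{h_n}]_{h'}$ where 
$e_{h_0} = E_{E}\llparenthesis s_1\rrparenthesis$ and $\langle e_{h_1} \ldots e_{h_n} \rangle = E_{Es}\llparenthesis s_2\rrparenthesis$; 

• $E_{E}\llparenthesis(\texttt{e0:primitive}\ s_1\ .\ s_2)\rrparenthesis = [\text{primitive}\ \pi\ e_{h_1} \ldots e_{h_n}]_{h'}$ where $\pi = ej_{symbol}(s_1)$ 
and $\langle e_{h_1} \ldots e_{h_n} \rangle = E_{Es}\llparenthesis s_2\rrparenthesis$; 

• $E_{E}\llparenthesis(\texttt{e0:if-in}\ s_1\ s_2\ s_3\ s_4)\rrparenthesis = [\text{if}\ e_{h_1} \in \{c_1 \ldots c_n\}\ \text{then}\ e_{h_2}\ \text{else}\ e_{h_3}]_{h'}$ where 
$e_{h_1} = E_{E}\llparenthesis s_1\rrparenthesis$, $\langle c_1 \ldots c_n \rangle = E_{Cs}\llparenthesis s_2\rrparenthesis$, $e_{h_2} = E_{E}\llparenthesis s_3\rrparenthesis$ and $e_{h_3} = E_{E}\llparenthesis s_4\rrparenthesis$; 

• $E_{E}\llparenthesis(\texttt{e0:fork}\ s_1\ .\ s_2)\rrparenthesis = [\text{fork}\ f\ e_{h_1} \ldots e_{h_n}]_{h'}$ where $f = ej_{symbol}(s_1)$ and 
$\langle e_{h_1} \ldots e_{h_n} \rangle = E_{Es}\llparenthesis s_2\rrparenthesis$; 

• $E_{E}\llparenthesis(\texttt{e0:join}\ s)\rrparenthesis = [\text{join}\ e_{h_1}]_{h'}$ where $e_{h_1} = E_{E}\llparenthesis s\rrparenthesis$; 

• we do not explicitly specify $E_{E}\llparenthesis(\texttt{e1:define}\ .\ s)\rrparenthesis$; 

• $E_{E}\llparenthesis(\texttt{e0:bundle}\ .\ s)\rrparenthesis = [\text{bundle}\ e_{h_1} \ldots e_{h_n}]_{h'}$ where $\langle e_{h_1} \ldots e_{h_n} \rangle = E_{Es}\llparenthesis s\rrparenthesis$; 

• $E_{E}\llparenthesis s\rrparenthesis = E_{E}\llparenthesis(\texttt{e0:variable}\ s)\rrparenthesis$ where $s$ is an s-symbol; 

• $E_{E}\llparenthesis(s_1\ .\ s_2)\rrparenthesis = E_{E}\llparenthesis(\texttt{e0:call}\ s_1\ .\ s_2)\rrparenthesis$ where $s_1$ is an s-symbol not in 
{e0:variable, e0:value, e0:let, e0:call, e0:call-indirect, e0:primitive, e0:if-in, 
e0:fork, e0:join, e0:bundle, e1:define}; 

where the non-macro sequence expander $E_{Es}\llparenthesis\cdot\rrparenthesis : \mathbb{S} \to \mathbb{E}^*$ is: 

• $E_{Es}\llparenthesis()\rrparenthesis = \langle\rangle$; 

• $E_{Es}\llparenthesis(s_1\ .\ s_2)\rrparenthesis = e_{h_1}.E_{Es}\llparenthesis s_2\rrparenthesis$ where $e_{h_1} = E_{E}\llparenthesis s_1\rrparenthesis$; 

the symbol sequence expander $E_{Xs}\llparenthesis\cdot\rrparenthesis : \mathbb{S} \to \mathbb{X}^*$ is: 

• $E_{Xs}\llparenthesis()\rrparenthesis = \langle\rangle$; 

• $E_{Xs}\llparenthesis(s_1\ .\ s_2)\rrparenthesis = x.E_{Xs}\llparenthesis s_2\rrparenthesis$ where $x = ej_{symbol}(s_1)$; 

and the value sequence expander $E_{Cs}\llparenthesis\cdot\rrparenthesis : \mathbb{S} \to \mathbb{C}^*$ is: 

• $E_{Cs}\llparenthesis()\rrparenthesis = \langle\rangle$; 

• $E_{Cs}\llparenthesis(s_1\ .\ s_2)\rrparenthesis = c.E_{Cs}\llparenthesis s_2\rrparenthesis$ where $c = ej(s_1)$. □ 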
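
As a reading aid, the shape of non-macro expansion can be sketched in Python. This is our own illustrative model, not the Scheme-macro implementation: s-expressions are nested Python lists with strings as s-symbols, expressions are tuples ending with a handle, ejection is the identity, and the names expand, expand_sequence and FORM_NAMES are hypothetical.

```python
import itertools

FORM_NAMES = {"e0:variable", "e0:value", "e0:let", "e0:call",
              "e0:call-indirect", "e0:primitive", "e0:if-in",
              "e0:fork", "e0:join", "e0:bundle", "e1:define"}

_fresh_handles = itertools.count(1)   # source of fresh handles h'

def expand(s):
    """Non-macro expansion of one s-expression into a tagged tuple."""
    if isinstance(s, str):                       # bare s-symbol: a variable
        return expand(["e0:variable", s])
    if not isinstance(s, list) or not s or not isinstance(s[0], str):
        raise ValueError("no self-evaluating atoms; use e0:value")
    head, rest = s[0], s[1:]
    if head not in FORM_NAMES:                   # fallback case: a call
        return expand(["e0:call", head] + rest)
    h = next(_fresh_handles)
    if head == "e0:variable":
        return ("variable", rest[0], h)
    if head == "e0:value":
        return ("value", rest[0], h)
    if head == "e0:let":
        names, bound, body = rest
        return ("let", tuple(names), expand(bound), expand(body), h)
    if head in ("e0:call", "e0:primitive", "e0:fork"):
        return (head[3:], rest[0], expand_sequence(rest[1:]), h)
    if head == "e0:call-indirect":
        return ("call-indirect", expand(rest[0]), expand_sequence(rest[1:]), h)
    if head == "e0:if-in":
        condition, cases, then_branch, else_branch = rest
        return ("if-in", expand(condition), tuple(cases),
                expand(then_branch), expand(else_branch), h)
    if head == "e0:join":
        return ("join", expand(rest[0]), h)
    if head == "e0:bundle":
        return ("bundle", expand_sequence(rest), h)
    raise NotImplementedError("e1:define has no expansion in Definition 5.8")

def expand_sequence(items):
    """The sequence expander: expand each element, preserving order."""
    return tuple(expand(item) for item in items)

example = expand(["fixnum:+", ["e0:value", 1], "x"])
```

Here example is a call to fixnum:+ whose operands are a value and a variable, obtained through the two fallback clauses at the end of the definition.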

The file bootstrap/scheme/epsilon0-in-scheme.scm implements non-macro 
expansion with Scheme macros. After loading it from guile+whatever, Scheme and 
ε₀ can be used together: 

• (+ 1 2) yields 3 as a Guile s-expression. 

• (e0:primitive fixnum:+ (e0:value 1) (e0:value 2)) yields the injected 
whatever 3, written in green as "3" (§3.3.1). 

It is worth remarking how Definition 5.8 does not define any self-evaluating atom, 
since doing so would create ambiguity with Scheme's predefined self-evaluating 
atoms: using e0:value in cases such as the example above is hence necessary at 
this stage. For example (e0:value 2) generates 2 as an injected whatever literal 
constant, which is different from Guile's 2. 

By contrast it is not necessary to use e0:variable and e0:call for implementing 
variables and procedure calls, as a consequence of the fact that Scheme and ε₀ 
share the same namespace for identifiers, at least at this stage. 

At this point ε₀ would be usable as an implementation language, if it provided 
some way of defining procedures and updating the global environment. A correct 
implementation of such facilities relies on reflective data structures and therefore 
belongs in Phase (iii) or even later; but once more we can use Guile to solve the 
bootstrap problem and provide a temporary implementation of an e1:define form. 

As for Scheme's define^14, we use the same form for defining either a 
non-procedure or a procedure, according to the shape of its s-cadr: respectively an 
s-symbol, or an s-list of one or more s-symbols. 

For a non-procedure definition, the second parameter is non-macro expanded, 
evaluated and the result bound to the symbol-ejection of the first parameter; for 
a procedure definition, the second parameter is non-macro expanded and bound as 
the body of a procedure whose name is the symbol-ejection of the s-car of the first 
parameter; the s-cdr of the first parameter contains an s-list of s-symbols whose 
ejections make up the procedure formals. 
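
The dispatch on the shape of the first parameter can be summarized by the following Python sketch. It is an illustration under our own assumptions rather than the Guile code: the two dictionaries stand in for the global environments, and expand and evaluate are hypothetical callables supplied by the caller in place of non-macro expansion and evaluation.

```python
global_values = {}   # non-procedure name -> value
procedures = {}      # procedure name -> (formals, body)

def define(target, body, expand, evaluate):
    """Model of a definition form: dispatch on the shape of the target."""
    if isinstance(target, str):
        # Non-procedure: expand, evaluate, and bind the result globally.
        global_values[target] = evaluate(expand(body))
    elif (isinstance(target, list) and target
          and all(isinstance(name, str) for name in target)):
        # Procedure: the first symbol is the name, the others the formals;
        # the body is expanded but not evaluated.
        name, *formals = target
        procedures[name] = (tuple(formals), expand(body))
    else:
        raise ValueError("target must be a symbol or a list of symbols")

identity = lambda x: x   # trivial stand-ins for the two phases
define("answer", 42, expand=identity, evaluate=identity)
define(["whatever:zero?", "a"],
       ["e0:primitive", "whatever:zero?", "a"],
       expand=identity, evaluate=identity)
```

Both branches update global tables regardless of where define is called from, mirroring the fact that definitions always act at the global level.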



^14 An important difference with respect to Scheme is how our definition facility always works on 
state environments (§2.4.1), therefore at a global level, and can be invoked anywhere an expression 
can occur in the code, at any nesting level. By contrast Scheme's definition facility updates the 
"current" environment, which happens to be the global one only if the form is used at the top 
level. Implementing ε's definition form over Guile required a relatively advanced and non-portable 
hack relying on Guile's module system. See the definition of define-object-from-anywhere in 
bootstrap/scheme/epsilon0-in-scheme.scm for the gory details. 

Again, e1:define is important for understanding the bootstrapping code and 
deserves a more precise description. Without explicitly specifying a non-macro 
expansion of an e1:define form into an ε₀ expression, we describe the behavior we 
require from such an expression: 

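Axiom 5.9 (definition forms) Let $s_1$ and $s_2$ be s-expressions. Then, if 
$e_h = E_{E}\llparenthesis(\texttt{e1:define}\ s_1\ s_2)\rrparenthesis$: 

• if $x = ej_{symbol}(s_1)$, $e_{h_2} = E_{E}\llparenthesis s_2\rrparenthesis$ and $e_{h_2}$ evaluates to $\langle c \rangle$, then we have that 
evaluating $e_h$ yields a state whose global environment binds $x$ to $c$; 

• if $\langle f, x_1 \ldots x_n \rangle = E_{Xs}\llparenthesis s_1\rrparenthesis$ and $e_{h_2} = E_{E}\llparenthesis s_2\rrparenthesis$, then we have that 
evaluating $e_h$ yields a state in which $f$ names a procedure with formals 
$x_1 \ldots x_n$ and body $e_{h_2}$ 

(the formal judgments of both cases are not recoverable from this copy of the text). 

The two cases are trivially exclusive, as $s_1$ cannot be an s-symbol and an s-list at 
the same time. 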

The reason why we did not provide an explicit non-macro expansion for e1:define 
in Definition 5.8 should be obvious at this point; since the actual implementation 
is in Scheme and its semantics is very clear, we have avoided writing a uselessly 
complex expansion into ε₀ assuming some global-updating primitive, even if that 
would have been possible; in particular it would have been very painful to provide 
an explicit encoding of ε₀ expressions as ε₀ data; the problem will be dealt with in 
Phase (iii) and in §5.4.4.3, where it will become relevant for the implementation. 

The actual Guile definition of e1:define shows an interesting feature: after 
performing the binding, e1:define updates global (Scheme) data structures keeping 
track of all the ε₀ definitions which have been performed, including procedure 
bodies. The need for this will become apparent in Phase (iv). 

5.4.1.3 Phase (iii): build reflective data structures and interpreter in ε₀ 

The purpose of this long phase is to define the global reflective data structures 
holding the program state and then the interpreter. We start from the associated library 
functionality we need, using only ε₀ in its s-expression encoding and e1:define. The 
task is complicated by the restrictions of the language, allowing for procedural but 
not syntactic abstraction. 

Equipped with Definition 5.8 and Axiom 5.9, the reader should be able to easily 
follow our running commentary on the main sections of bootstrap/scheme/core.e. 
Each section is delimited by a well-visible comment including a full line of semicolons. 

The general low-level "feel" of ε₁ becomes apparent right from the first sections: 
to create some order in the context of a global flat namespace, we adopt the convention 
of having all procedure and global names begin with a reasonable namespace 
prefix delimited by a colon. Most of the procedures we define must work also after 
unexecing, hence they must not rely on unexecing tag primitives: all the code 
in bootstrap/scheme/core.e works on untyped objects ignoring any boxedness 
tags; and of course the distinction between booleans, characters or small fixnums is 
purely conventional and a whatever object may represent the number zero, the 
false boolean or even a null pointer, according to the context: the machine 
representation after unexecing is exactly the same. 

To make ε₀ more convenient to write, we usually define global procedures to wrap 
primitives, with their same names; either of the definitions of the test-for-zero 
procedure whatever:zero? and the equality-by-identity procedure whatever:eq? in 
the first section, Utility procedures working on any data, is a good example: 



    (e1:define (whatever:zero? a)
      (e0:primitive whatever:zero? a))



Such definitions only serve to simplify calling syntax, for example allowing the user 
to write (whatever:zero? s) instead of (e0:primitive whatever:zero? s); in 
terms of procedural "abstraction power", they abstract very little. 

We start defining operations over the simplest types: the empty list object is 
simply the fixnum 0; booleans are represented as physical machines usually do, 
using 0 for false (written #f) and any other value for true, including 1 which 
we also write as #t; in other words, we use generalized booleans; the procedure 
boolean:canonicalize canonicalizes a generalized boolean into either 0 or 1. As 
a convention derived from Scheme, a question mark "?" at the end of a procedure 
name serves to remind the user that the procedure is a predicate, which is to say a 
procedure returning a boolean result. 
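
A couple of lines of Python suffice to model generalized booleans, assuming machine words are represented as Python integers; the name boolean_canonicalize mirrors boolean:canonicalize but the code is our own sketch, not a transcription of core.e.

```python
FALSE = 0   # written #f; the same word also represents the empty list
TRUE = 1    # written #t; any other nonzero word is also considered true

def boolean_canonicalize(word):
    """Map a generalized boolean to exactly 0 or 1."""
    return FALSE if word == 0 else TRUE

canonical = [boolean_canonicalize(word) for word in (0, 1, 17, -3)]
```

Canonicalization matters whenever two truth values are compared by identity: 17 and 1 are both true, yet they are different words.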

The section dealing with Fixnums contains primitive wrappers for arithmetic and 
bitwise operations, plus some very simple definitions such as minimum, maximum 
and a parity test; the only slightly more sophisticated procedures are the fixnum 
exponentiation (by squaring) procedure fixnum:**, and the base-10 logarithm. An 
annoying repeating pattern with conditionals is already visible at this point: we 
often have an e0:if-in form testing for a boolean condition, using {#f} as the 
conditional case set to discriminate between false and any other value, so that what 
we think of as the "else" branch is always the first one: 



    (e1:define (fixnum:** base exponent)
      (e0:if-in exponent (0)
        (e0:value 1)
        (e0:if-in (fixnum:odd? exponent) (#f)
          (fixnum:square (fixnum:** base (fixnum:half exponent)))
          (fixnum:* base (fixnum:** base (fixnum:1- exponent))))))



Unfortunately we cannot factor away this ugly pattern before introducing macros. 

The Buffers section contains more trivial primitive wrappers for memory-related 
primitives: a buffer may be created by buffer:make, destroyed by buffer:destroy, 
read by buffer:get or updated by buffer:set!. Buffer size is only stored as part 
of boxedness tags and must not be accessed outside unexec, which is the reason 
why we did not provide a procedure wrapper to conveniently extract it, lest it be 
used by mistake. buffer:get and buffer:set! have two and three parameters 
respectively: we chose to use explicit offsets (0-based, in words) rather than pointer 
arithmetic for accessing memory, in order to avoid making assumptions on memory 
management systems, which often constrain the use of inner pointers. As in Scheme, 
an exclamation mark "!" at the end of a procedure name serves to conventionally 
remind the user that the procedure has side effects. 

The Boxedness section contains some functionality to check whether a word is a 
candidate pointer: for example fixnums below a fixed small constant, or fixnums 
not divisible by the word size in bytes cannot represent pointers on any modern 
byte-addressed machine (§3.3.2.1). 

Having defined buffers, it is very easy to define Conses: conses are simply two-word 
buffers with handy constructor, accessor and updater procedures. Differently from 
s-conses, conses as defined in this section do not necessarily hold two s-expressions: 
they are completely generic, mutable pairs of untyped objects: 



    (e1:define (cons:make car cdr)
      (e0:let (result) (buffer:make-uninitialized (e0:value 2))
        (e0:let () (buffer:set! result (e0:value 0) car)
          (e0:let () (buffer:set! result (e0:value 1) cdr)
            result))))

The use of nested zero-binding e0:let blocks to simulate a statement sequence is another 
unfortunate recurring pattern which we are forced to live with until we introduce 
macros. 
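
For comparison, here is the same construction as a Python sketch in which a buffer is modelled as a Python list and offsets count words. The buffer procedures mirror buffer:make-uninitialized, buffer:get and buffer:set!; cons_car, cons_cdr and cons_set_car are hypothetical names for the accessors and updaters, which the text mentions without naming.

```python
def buffer_make_uninitialized(size_in_words):
    return [None] * size_in_words      # None models uninitialized memory

def buffer_get(buffer, offset):
    return buffer[offset]

def buffer_set(buffer, offset, value):
    buffer[offset] = value             # side effect, like buffer:set!

def cons_make(car, cdr):
    """A cons is just a two-word buffer."""
    result = buffer_make_uninitialized(2)
    buffer_set(result, 0, car)         # the statement sequence which the
    buffer_set(result, 1, cdr)         # nested e0:let blocks simulate
    return result

def cons_car(cons):
    return buffer_get(cons, 0)

def cons_cdr(cons):
    return buffer_get(cons, 1)

def cons_set_car(cons, new_car):
    buffer_set(cons, 0, new_car)

pair = cons_make(1, 2)
cons_set_car(pair, 10)                 # conses are mutable
```

Nothing constrains what the two words hold: any untyped object will do, exactly as the text states.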

The Lists section introduces singly-linked lists made of right-nested conses, and 
utility procedures to work with them. We "define" a list as either the empty list 
list:nil, which is to say 0, or a cons whose right element is another list. The 
quotes in the previous sentence are necessary: in keeping with ε₁'s nature the fact 
that the right side of a list cons be another list is a pure convention, never enforced 
with static or dynamic checks. Faithful to the motto "garbage in, garbage out", we 
simply let the system fail at run time when a non-pointer, non-cons or a cons whose 
right side is not a list is used in place of a list: with a reasonable error message 
in the case of guile+whatever, but likely with a crude Segmentation Fault after 
unexec (§3.3.2.1). 

Of course we impose no constraint over the element shape, and lists are not
necessarily homogeneous. Our utility procedures over lists include the usual
operations for appending, flattening, computing length, selecting by index. Thanks to
e0:call-indirect we could have supported higher-order procedures (without
nonlocals), but we refrained from doing so in this low-level core.

A useful way of employing lists is to make association lists or alists (pronounced
"ey-lists"), the only slightly delicate issue in our case being the way of comparing
keys: the first, simplest and most efficient way is "by identity": the section Alists
with unboxed keys defines procedures using single-word comparison for keys; this is
always appropriate for unboxed or unique keys and when the key identity matters,
but is not reliable in general with boxed keys, where the same content may be
replicated in more than one buffer.
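
The difference between the two ways of comparing keys can be sketched in Python, under the assumption that a Python list stands for a boxed buffer and that the operator "is" stands for single-word pointer comparison. All names below are hypothetical and chosen only for this illustration.

```python
def alist_lookup_by_identity(alist, key):
    """Compare keys as single words: only the very same object matches."""
    for candidate, value in alist:
        if candidate is key:
            return value
    return None


def alist_lookup_by_content(alist, key):
    """Compare keys element by element, as needed for string-like keys."""
    for candidate, value in alist:
        if candidate == key:
            return value
    return None


key_a = [102, 111, 111]   # the characters of "foo" stored in one buffer
key_b = [102, 111, 111]   # the same content replicated in another buffer
alist = [(key_a, 1)]

found_same_buffer = alist_lookup_by_identity(alist, key_a)
found_other_buffer = alist_lookup_by_identity(alist, key_b)
found_by_content = alist_lookup_by_content(alist, key_b)
```

Lookup by identity succeeds only for key_a, the very buffer stored in the alist; key_b has equal content but lives in a different buffer, so only the lookup by content finds it.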

A vector is a pointer to a buffer with the first element reserved to store the number
of payload elements, which is useful in many contexts where the size of a random-access
sequence is not fixed, and we cannot rely on boxedness tags. The section Vectors
provides procedures to look up and update vectors, obtain their length, and
some other utility operations including append, blit, and conversion to and from
list. The procedure vector:equal-unboxed-elements? compares two vectors by
comparing their respective elements by identity, which is the most common case. No
bounds checking is performed, and elements might be heterogeneous. Vectors as
defined in this section cannot be resized, as re-allocating them would change their
pointer "identity".

The next section deals with Characters and Strings: at this level, characters are just
fixnums and strings are just vectors, which entails the somewhat space-inefficient
choice of having each character take one entire word in memory. Yet string support
becomes computationally simple, and the wide range of each character suffices for
supporting all Unicode code points, at a fixed width. Apart from some trivial I/O,
string procedures are just trivial "aliases", or actually wrappers, of vector procedures:

(e1:define (string:equal? s1 s2)
  (vector:equal-unboxed-elements? s1 s2))
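
A small Python model may help in picturing the layout: a vector is a buffer whose word 0 holds the payload length, and a string is a vector of code points, one per word. The function names are hypothetical; since the elements here are unboxed integers, the model compares them with ==, which for single words coincides with comparison by identity.

```python
def vector_make(length, initial=0):
    """Allocate length + 1 words: word 0 is the header holding the length."""
    return [length] + [initial] * length


def vector_length(vector):
    return vector[0]


def vector_get(vector, index):
    return vector[1 + index]      # skip the header word


def vector_set(vector, index, value):
    vector[1 + index] = value


def vector_equal_unboxed_elements(a, b):
    if vector_length(a) != vector_length(b):
        return False
    return all(vector_get(a, i) == vector_get(b, i)
               for i in range(vector_length(a)))


def string_from_python(text):
    """Store one code point per word, at a fixed width."""
    result = vector_make(len(text))
    for index, character in enumerate(text):
        vector_set(result, index, ord(character))
    return result


# A string procedure is just a wrapper of the vector procedure.
def string_equal(s1, s2):
    return vector_equal_unboxed_elements(s1, s2)


greeting = string_from_python("héllo")
```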

Having defined string support, we are ready to deal with the second kind of
association lists, in the SALists section: an salist (pronounced "ess-ey-list") has strings,
or other vectors with elements compared by identity, as keys.

We then have a short section about Boxes: a box, similarly to an ML ref, encapsulates
the idea of a mutable memory cell, implemented as a pointer to a single-word
buffer. Utility procedures include support for incrementing a mutable counter, in
case a box contains a fixnum. For example (omitting the other obvious variant
box:bump-and-get!):

(e1:define (box:get-and-bump! box)
  (e0:let (old-value) (box:get box)
    (e0:let () (box:set! box (fixnum:1+ old-value))
      old-value)))

Of course fixnum:1+ is a successor procedure, and fixnum:1- is the predecessor†.

With alists and vectors at our disposal we are ready to implement Hashes having
either unboxed objects or strings as keys: we also use a box to introduce a level
of indirection making a hash easy to resize. A hash table is therefore a box referring to
a vector; the first payload element of the vector (after the vector "header" word
holding the element number) is reserved to keep the hash element number, so that
the fill factor is easy to compute at any time; the other vector elements are the hash
buckets, implemented as alists or salists; of course the associated data may have
any shape. As we need the hash function to be portable with respect to unexec,
it respects the constraint in §3.3.2.2. Hashes are our most complex data structure
so far: automatic resizing, and particularly comparing fill factors with a threshold
using only fixnums, requires some sophistication, but at around 250 lines our hash
table implementation in ε₀ does not end up being overly complicated, despite our
choice of avoiding higher-order procedures leading to some code redundancy:



(e1:define (unboxed-hash:set! hash key value)
  (e0:let () (e0:if-in (hash:overfull? hash) (#f)
               (e0:bundle)
               (unboxed-hash:enlarge! hash))
    (unboxed-hash:set-without-resizing! hash key value)))
(e1:define (string-hash:set! hash key value)
  (e0:let () (e0:if-in (hash:overfull? hash) (#f)
               (e0:bundle)
               (string-hash:enlarge! hash))
    (string-hash:set-without-resizing! hash key value)))



The two-way conditional performing a side effect in one branch and returning a 
"dummy" bundle in the other is another code pattern which we can't factor away 
without macros. 
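
The layout just described can be modelled in a few lines of Python; this is only a sketch under stated assumptions, not the roughly 250-line implementation discussed above. A one-element list plays the box, the inner list plays the vector payload (element count first, then the buckets), keys are non-negative integers standing for unboxed objects, and the 3/4 threshold, the doubling policy and all names are hypothetical.

```python
def hash_make(bucket_no=4):
    # box -> [element count, bucket 0, bucket 1, ...]
    return [[0] + [[] for _ in range(bucket_no)]]


def hash_overfull(hash_box):
    table = hash_box[0]
    bucket_no = len(table) - 1
    # element_no / bucket_no > 3/4, checked with integer arithmetic only
    return table[0] * 4 > bucket_no * 3


def _bucket(table, key):
    return table[1 + key % (len(table) - 1)]


def hash_set_without_resizing(hash_box, key, value):
    table = hash_box[0]
    bucket = _bucket(table, key)
    for binding in bucket:
        if binding[0] == key:
            binding[1] = value        # overwrite an existing binding
            return
    bucket.append([key, value])
    table[0] += 1                      # keep the element count up to date


def hash_enlarge(hash_box):
    old_table = hash_box[0]
    # Only the box content changes: pointers to the box remain valid.
    hash_box[0] = [0] + [[] for _ in range(2 * (len(old_table) - 1))]
    for bucket in old_table[1:]:
        for key, value in bucket:
            hash_set_without_resizing(hash_box, key, value)


def hash_set(hash_box, key, value):
    if hash_overfull(hash_box):
        hash_enlarge(hash_box)
    hash_set_without_resizing(hash_box, key, value)


def hash_get(hash_box, key):
    for candidate, value in _bucket(hash_box[0], key):
        if candidate == key:
            return value
    return None


squares = hash_make(2)
for n in range(20):
    hash_set(squares, n, n * n)
```

The fill factor test multiplies both sides instead of dividing, which is one way of comparing a ratio with a threshold when only fixnums are available.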

Our first and most important application of hash tables is in the implementation of 
Symbols. Since we use them as identifiers, symbols are central in our design; they 
must be efficient to compare with one another, and as keys in associative structures 
such as the global (§2.4.2) and procedure (§2.4.4) state environments. 

The symbol table is a global hash mapping each symbol name, as a string, into 
a unique boxed symbol object with that name; by requiring that all named symbols 
be interned in the table, as all Lisps do, we obtain that symbol pointers can be safely 
compared by identity, just like unboxed objects; interning the same name more than 
once yields the same symbol object pointer: 



(e1:define (symbol:intern name-as-string)
  (e0:if-in (symbol:interned-symbol-name? name-as-string) (#f)
    (symbol:intern-without-checking! (vector:copy name-as-string))
    (string-hash:get symbol:table name-as-string)))

(e1:define (symbol:intern-without-checking! name-as-string)
  (e0:let (new-symbol) (symbol:make-uninterned)
    (e0:let () (buffer:set! new-symbol (e0:value 0) name-as-string)
      (e0:let () (string-hash:set! symbol:table name-as-string new-symbol)
        new-symbol))))

† The names 1+ and 1- for successor and predecessor come from the Scheme tradition, which
we suppose inherited the convention from some Reverse-Polish stack language; to decrement the
top stack element in Forth, for example, we may push 1 and then subtract, replacing the two
topmost elements with the subtraction result. The Forth code is two words long: "1 -".



Following the example of several Lisps [4, 47] we also support uninterned symbols, 
which is to say symbol objects with no name and hence not occurring in the symbol 
table; an uninterned symbol pointer can be compared by identity with any other 
symbol pointer. 
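
Interning can be sketched in Python as follows, assuming a dictionary in place of the string hash and a small class in place of the symbol buffer; the names Symbol, symbol_intern and symbol_make_uninterned are hypothetical.

```python
class Symbol:
    def __init__(self, name=None):
        self.name = name            # None for an uninterned symbol


symbol_table = {}                   # symbol name -> unique Symbol object


def symbol_intern(name):
    symbol = symbol_table.get(name)
    if symbol is None:
        # First occurrence of this name: create the unique object for it.
        symbol = Symbol(name)
        symbol_table[name] = symbol
    return symbol


def symbol_make_uninterned():
    # No name, and no entry in the symbol table.
    return Symbol()


foo_1 = symbol_intern("foo")
foo_2 = symbol_intern("foo")
bar = symbol_intern("bar")
anonymous = symbol_make_uninterned()
```

Since the table guarantees one object per name, foo_1 and foo_2 are the same object and comparing symbols reduces to comparing pointers; the uninterned symbol is distinct from every other symbol by construction.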

The issue of what to store in the symbol object itself appears of little
consequence: it is useful to point to the symbol name string from within the symbol
object, to have an efficient means of retrieving it when needed, typically for printing; but
apart from this, the symbol object seems much more useful for its identity than for
its content. And indeed, were it not for the problem of §3.3.2.2, we could safely use 
symbol pointers like "unboxed" hashed keys for state environments (§2.4.4) such as 
the global environment or the procedure table; but a better solution has been known 
for decades, described for example in [82] and in embryonic form already in [51]: 
the idea is to entirely do away with such global tables, and store the global data 
associated to each symbol within the symbol object itself; then the symbol object 
may be seen as a record, whose fields include the symbol name, the value in the 
global environment, the formal names of the symbol interpreted as a procedure, the 
procedure body, and so on. Where a pointer to a symbol is available, accessing it 
in a state environment only costs one load or store instruction with constant offset. 
In the interest of extensibility, we also keep one alist field in each symbol object, to
which the user is free to add bindings†.

As a natural consequence of this design, in ε₁ we may use the same symbol as
a key for different state environments, for example the global environment and the 
procedure table, and use the same name for a global non-procedure and a proce- 
dure: the possibility of using the same name as key for two (or more) distinct state 
environments is what distinguishes a so-called Lisp-2 such as Common Lisp, from a 
Lisp-1, such as Scheme [32]. 
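
The following Python sketch illustrates the idea of storing state environments inside symbol objects, with a nine-slot list standing for the symbol buffer. The slot numbers follow the offsets used by the listings in this section (0 for the name, 1 and 2 for the global binding, 3 and 4 for formals and body); the Python names and the sample body are hypothetical.

```python
NAME, GLOBAL_BOUND, GLOBAL_VALUE, FORMALS, BODY = 0, 1, 2, 3, 4
UNBOUND_MARKER = 127               # conventional marker for "no global value"


def symbol_make(name):
    # Slots 5 to 7 (macro and primitive data) are left at 0;
    # slot 8 is the extension alist.
    return [name, 0, UNBOUND_MARKER, [], 0, 0, 0, 0, []]


def global_set(symbol, value):
    symbol[GLOBAL_BOUND] = 1
    symbol[GLOBAL_VALUE] = value


def procedure_set(symbol, formals, body):
    symbol[FORMALS] = formals
    symbol[BODY] = body


def is_global(symbol):
    return symbol[GLOBAL_BOUND] == 1


def is_procedure(symbol):
    return symbol[BODY] != 0


f = symbol_make("f")
global_set(f, 42)                                   # f as a global variable
procedure_set(f, ["x"], ("variable", "x"))          # f as a procedure
```

Each access is a single indexing operation at a constant offset, and the same symbol f is bound independently in the two environments, which is the Lisp-2 behaviour described above.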

The Symbols section in the source code would be straightforward except for ε₀'s
painful lack of records, which at this point we still have to simulate with buffers.



(e1:define (symbol:make-uninterned)
  (e0:let (result) (buffer:make (e0:value 9))
    (e0:let () (buffer:set! result (e0:value 0) (e0:value 0)) ; name
    (e0:let () (buffer:set! result (e0:value 1) (e0:value 0)) ; unbound in global environment
    (e0:let () (buffer:set! result (e0:value 2) (e0:value 127)) ; conventional unbound marker
    (e0:let () (buffer:set! result (e0:value 3) (e0:value 0)) ; empty formal list
    (e0:let () (buffer:set! result (e0:value 4) (e0:value 0)) ; no procedure body
    (e0:let () (buffer:set! result (e0:value 5) (e0:value 0)) ; no macro definition
    (e0:let () (buffer:set! result (e0:value 6) (e0:value 0)) ; no macro procedure
    (e0:let () (buffer:set! result (e0:value 7) (e0:value 0)) ; no primitive descriptor
    (e0:let () (buffer:set! result (e0:value 8) alist:nil) ; no extensions
      result)))))))))))

† An alist, later called "property list", was actually the only datum globally associated to symbols
in [51] (p. 25), also containing a binding for the symbol name. Having fields at fixed offsets in the
record object, out of the alist, may be regarded as an optimization.

(e1:define (state:global-set! name value)
  (e0:let () (buffer:set! name (e0:value 1) (e0:value 1)) ;; the name is bound as a global
    (buffer:set! name (e0:value 2) value))) ;; value
(e1:define (state:procedure-set! name formals body)
  (e0:let () (buffer:set! name (e0:value 3) formals)
    (buffer:set! name (e0:value 4) body)))



Only from this point on can we find instances of symbol literals in the source, such as
(e0:value foo): in the Scheme implementation of ε₀ from Phase (ii), e0:value
is a Scheme macro which, at Scheme macroexpansion time, generates an interned
symbol using the functionality above. The same holds for string literals in the
Strings section above, but the case of symbols is much more remarkable due to their
greater complexity, and due to a global structure being involved.

Finally we provide a functionality to automatically generate fresh symbols,
particularly useful for machine-generated code. Fresh symbols are interned and have a
name starting with a prefix the user is not supposed to use in her own identifiers,
currently "_". We adopted this solution based on a convention rather than the
alternative of using uninterned symbols because of the need to extract all symbols from
the symbol table (see for example state:global-names in §3.1); moreover interned
symbols with a conventional prefix are easier to print and read, when needed for
debugging. Assuming that "_"-prefixed symbols do not occur in user identifiers,
generated symbols may be safely garbage-collected†.

The Expressions section defines ε₀ expressions, as per Definition 2.1 plus
call-indirect (§5.4.1.2), as a data structure. An expression may be conceptually seen
as a sum-of-products in the style of ML, and in practice it is implemented as a boxed 
object: the pointed buffer contains an expression case tag in its first position; the 
expression handle, present in all cases, resides in the second element; the other ele- 
ment contents, and the buffer length, depend on the specific expression case. Where 
expressions require homogeneous sequences of undetermined length (for example 
bundle items and if conditional cases), by convention we always use lists. 

There are operators to build, inspect, update, and explode (obtain all components
as a bundle) expressions. Such code could conceptually be written by hand,
but due to its regularity and length, we chose to generate it with a Scheme program;
the machine-written ε₀ code is the first non-comment line in the section, easy to spot
as the lone, huge 14000-character line with only the minimum required whitespace.

† Interned symbols are not yet garbage collected at the time of writing. A solution is employing
a second symbol table for "_"-prefixed symbols, implemented as a weak hash table [98]. However
globals and procedures explicitly named by the user cannot be safely destroyed in general,
as they could be referenced in the future, possibly by dynamically-created expressions.

Our expression as managed by the program-generated code has the exact same 
memory representation as an equivalent data structure defined (much later: see 
§5.4.4.3) by our general sum-of-product definition facility. 

The machine-generated expression constructors always require all components
to be specified, including handles. As the practice is tedious and we can easily
generate fresh handles (as fixnums), we also provide a set of hand-written wrappers,
named after the symbols identifying the corresponding expression case in expansion
followed by an "*" character.



(e1:define e0:handle-generator-box
  (box:make-initialized (e0:value 0)))
(e1:define (e0:fresh-handle)
  (box:bump-and-get! e0:handle-generator-box))

(e1:define (e0:variable* name)
  (e0:expression-variable-make (e0:fresh-handle) name))
(e1:define (e0:call* name actuals)
  (e0:expression-call-make (e0:fresh-handle) name actuals))



For example the expression which (e0:call p 57) expands to may be built
by (e0:call* (e0:value p) (list:singleton (e0:value* 57))), which is not
unreadable after considering how the literal constant expression which 57 expands
to, being an expression, has a different representation from the fixnum 57.

It may be worth stressing explicitly that the ε₀ implementation of Phase (ii)
does not rely at all on the ε₀ expression data structure as defined in this section.

The next section State: global dynamic state, with reflection conceptually imple- 
ments state environments, but for the most part is actually a thin wrapper over 
buffer accessors used on symbol objects. For example, the following definitions 
concern the "procedure table", despite it not existing anywhere as a single data 
structure: 



(e1:define (state:procedure? name)
  (state:procedure-get-body name)) ;; #f iff unbound, which is to say return the body
(e1:define (state:procedure-get-formals name)
  (buffer:get name (e0:value 3)))
(e1:define (state:procedure-get-body name)
  (buffer:get name (e0:value 4)))
(e1:define (state:procedure-get-in-dimension name)
  (list:length (state:procedure-get-formals name)))
(e1:define (state:procedure-get name)
  (e0:bundle (state:procedure-get-formals name)
             (state:procedure-get-body name)))
(e1:define (state:procedure-set! name formals body)
  (e0:let () (buffer:set! name (e0:value 3) formals)
    (buffer:set! name (e0:value 4) body)))
(e1:define (state:procedure-unset! name)
  (e0:let () (buffer:set! name (e0:value 3) list:nil)
    (buffer:set! name (e0:value 4) (e0:value 0)))) ;; make the body invalid



It is particularly obvious now that our implementation of e1:define from Phase (ii)
could not update ε₁'s state environments such as the global environment and the
procedure table, which we have implemented just now: up to this point anything
which has been globally defined in Phase (iii) has been only set in Guile's state
environments. This problem will persist until Phase (iv).

By examining all the buckets in the symbol table it is easy to obtain the list
of all interned symbols bound to a procedure, or a primitive. From this
information, we can automatically build an apply function in the style of [72], plus
another procedure in the same spirit for primitives. The automatically-generated
procedures state:apply and state:apply-primitive (collectively appliers) each
take two arguments, a symbol naming the object to call, and a parameter list;
the result bundle is encoded as a list. The generated applier body consists of
a deeply-nested conditional, comparing the first parameter with each applicable
name: if the name matches, the corresponding primitive is called in the case of
state:apply-primitive, wrapping the results into a list, while state:apply relies
on the interpreter for evaluation.
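
The generation of an applier can be imitated in Python by building the source text of a deeply nested conditional from a table of names and then compiling it. This is only an analogy: the table of primitives, the primitive names and the function names below are hypothetical, and the actual implementation builds expressions rather than source text.

```python
import operator

# Hypothetical primitive table: name -> implementation.
primitives = {
    "fixnum:+": operator.add,
    "fixnum:*": operator.mul,
    "fixnum:negate": operator.neg,
}


def generate_applier_source(names):
    """Return Python source for a nested if/else chain, one level per name."""
    lines = ["def apply_primitive(name, actuals):"]
    indent = "    "
    for primitive_name in names:
        lines.append(f"{indent}if name == {primitive_name!r}:")
        # Wrap the result into a list, standing for the result bundle.
        lines.append(f"{indent}    return [primitives[{primitive_name!r}](*actuals)]")
        lines.append(f"{indent}else:")
        indent += "    "
    lines.append(f"{indent}raise ValueError('unknown primitive: ' + name)")
    return "\n".join(lines)


applier_source = generate_applier_source(sorted(primitives))
namespace = {"primitives": primitives}
exec(applier_source, namespace)          # compile the generated definition
apply_primitive = namespace["apply_primitive"]

sum_bundle = apply_primitive("fixnum:+", [2, 3])
```

The generated definition has to be rebuilt whenever the set of applicable names changes, since the names are baked into the conditional.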

Generating appliers is our first example in the implementation of
dynamically-generated code. Of course such generation heavily relies on dynamically-built ε₀
expressions.

With global tables and appliers, we are finally ready to implement a working
interpreter in the next section, epsilon0 self-interpreter.

The interpreter code is not overly complex and the main procedure e0:eval is
but a long dispatcher, selecting the appropriate expression case and tail-calling a
helper procedure. Its second parameter is the local environment, encoded as an
alist:



(e1:define (e0:eval e local)
  (e0:if-in (e0:expression-variable? e) (#t)
    (e0:let (h name) (e0:expression-variable-explode e)
      (e0:eval-variable name local))
  (e0:if-in (e0:expression-value? e) (#t)
    (e0:let (h content) (e0:expression-value-explode e)
      (e0:eval-value content))
  (e0:if-in (e0:expression-bundle? e) (#t)
    (e0:let (h items) (e0:expression-bundle-explode e)
      (e0:eval-bundle items local))
  (e0:if-in (e0:expression-primitive? e) (#t)
    (e0:let (h name actuals) (e0:expression-primitive-explode e)
      (e0:eval-primitive name actuals local))
  (e0:if-in (e0:expression-let? e) (#t)
    (e0:let (h bound-variables bound-expression body) (e0:expression-let-explode e)
      (e0:eval-let bound-variables bound-expression body local))
  (e0:if-in (e0:expression-call? e) (#t)
    (e0:let (h name actuals) (e0:expression-call-explode e)
      (e0:eval-call name actuals local))
  (e0:if-in (e0:expression-call-indirect? e) (#t)
    (e0:let (h procedure-expression actuals) (e0:expression-call-indirect-explode e)
      (e0:eval-call-indirect procedure-expression actuals local))
  (e0:if-in (e0:expression-if-in? e) (#t)
    (e0:let (h discriminand values then-branch else-branch)
            (e0:expression-if-in-explode e)
      (e0:eval-if-in discriminand values then-branch else-branch local))
  (e0:if-in (e0:expression-fork? e) (#t)
    (e0:let (h name actuals) (e0:expression-fork-explode e)
      (e0:eval-fork name actuals local))
  (e0:if-in (e0:expression-join? e) (#t)
    (e0:let (h future) (e0:expression-join-explode e)
      (e0:eval-join future local))
  (e0:if-in (e0:expression-extension? e) (#t)
    (e0:let (h name subexpressions) (e0:expression-extension-explode e)
      (e0:eval-extension name subexpressions local))
  (e1:error (e0:value "impossible")))))))))))))))



In order to keep the code understandable despite the deeply-nested conditionals,
we chose not to assume generalized booleans in e0:eval, making sure that all the
predicates we used only return #t or #f.

Several helper procedures in their turn rely on e0:eval-expressions, which
sequentially evaluates a list of expressions which have to return 1-dimension bundles,
and returns the result list.

Many interpreter procedures are strongly interdependent and mutually recursive, 
which is served quite well by procedural abstraction. It is very convenient to define 
mutually-recursive procedures without concern for the definition order, so that the 
programmer does not need to keep a call graph in her mind. 



(e1:define (e0:eval-expressions expressions local)
  (e0:if-in expressions (0)
    list:nil
    (list:cons (e0:unbundle (e0:eval (list:head expressions) local))
               (e0:eval-expressions (list:tail expressions) local))))
(e1:define (e0:unbundle bundle)
  (e0:if-in (list:null? bundle) (#f)
    (e0:if-in (list:null? (list:tail bundle)) (#f)
      (e1:error (e0:value "e0:unbundle: the bundle has at least two elements"))
      (list:head bundle))
    (e1:error (e0:value "e0:unbundle: empty bundle"))))



Most helper procedures dealing with specific expression cases end up being simple:

(e1:define (e0:eval-variable name local)
  (list:singleton (e0:if-in (alist:has? local name) (#f)
                    (state:global-get name)
                    (alist:lookup local name))))

(e1:define (e0:eval-value content)
  (list:singleton content))

(e1:define (e0:eval-if-in discriminand values then-branch else-branch local)
  (e0:let (discriminand-value) (e0:unbundle (e0:eval discriminand local))
    (e0:if-in (list:memq discriminand-value values) (#f)
      (e0:eval else-branch local)
      (e0:eval then-branch local))))

(e1:define (e0:eval-bundle items local)
  (e0:eval-expressions items local))

A possibly striking implementation choice consists in encoding ε₀ bundles as lists;
this is necessary for examining bundle results, for example by testing their length:
the fact that bundles are not denotable (§2.1.5) makes them hard to deal with
directly, in exchange for their potential efficiency in a compiled implementation.
Ironically, in this self-interpreter (where performance is not a priority anyway)
the need to build lists for bundles entails a high rate of heap allocation, which is
expensive.
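
The encoding of bundles as lists can be seen at work in the following Python sketch of four expression cases. It is a simplified model under stated assumptions: expressions are tuples whose first element names the case, handles are omitted, the local environment is a dictionary rather than an alist, and the global table and all names are hypothetical.

```python
global_environment = {"answer": 42}


def unbundle(bundle):
    """Extract the only element of a 1-dimension bundle."""
    if len(bundle) != 1:
        raise ValueError("expected a 1-dimension bundle")
    return bundle[0]


def eval_expression(expression, local):
    """Evaluate an expression; the result bundle is always a Python list."""
    case = expression[0]
    if case == "variable":
        name = expression[1]
        # Locals shadow globals.
        return [local[name] if name in local else global_environment[name]]
    if case == "value":
        return [expression[1]]
    if case == "bundle":
        # Each item must yield exactly one result.
        return [unbundle(eval_expression(item, local)) for item in expression[1]]
    if case == "if-in":
        _, discriminand, values, then_branch, else_branch = expression
        discriminand_value = unbundle(eval_expression(discriminand, local))
        branch = then_branch if discriminand_value in values else else_branch
        return eval_expression(branch, local)
    raise ValueError("impossible")


conditional = ("if-in", ("variable", "x"), [0, 1],
               ("value", "then"), ("value", "else"))
pair_bundle = eval_expression(
    ("bundle", [("variable", "x"), ("variable", "answer")]), {"x": 1})
```

Even the evaluation of a literal constant allocates a fresh one-element list here, which is the allocation cost remarked upon above.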

The self-interpreter does not rely on explicit stacks and is quite far from the
semantics in §2.5; yet, of course without hope of certifying the implementation here,
we believe it respects the semantics and implementation notes of §2.

With the most fundamental addend types at our disposal we are ready to deal 
with general support for user-defined types, in the Type table section. 

We start by building support for tracking the extensible set of "types"
recognized by the system, such as the empty list, booleans, fixnums, and conses; since in
this context we assume dynamic typing, there is no need for type parameters: all
the subcomponents of an s-expression are tagged, at any level.

As types tend to be relatively few in number and this reflective information is 
not particularly critical to performance, in this case we preferred a global table to 
the alternative of reserving fields in all symbol objects. 

Information for each type (empty list, boolean, fixnum, cons, string, ...) is en- 
coded in a descriptor record implemented as a buffer, also containing a unique tag, 
other information including a printer procedure, and once again an alist which the 
user may employ to add more fields as type-dependent attributes — it is unfor- 
tunately too early to define a general-purpose "extensible record" data structure, 
without any support for syntactic abstraction. 

The most interesting field in type descriptor records is the expression-expander
procedure name. An expression-expander specifies how to turn an object of the
given type into an expression. Since of course we provide procedures to update the
type table, the user has the power to define and change the way addends expand,
including, for example, the mapping from a symbol into a variable as discussed in
Syntactic Convention 5.7 and §5.3.2.

All support for macros, following Lisp syntactic conventions†, will be defined
later in procedures called in their turn by the cons expander procedure; our predefined
expanders will implement expansion in a way compatible with Definition 5.8.

Definition 5.10 (expression-expander) Let $A_0, \ldots, A_{n-1}$ be the addend types of
$S$. Then the expander procedure for $A_i$, or $A_i$-expander, is a procedure of one
parameter returning one result. The procedure is guaranteed not to fail if the parameter
has type $A_i$, in which case the result has type $E$. □

Most atomic objects such as the empty list, booleans and fixnums, are
expression-expanded by sexpression:literal-expression-expander into a literal constant
expression, which will finally allow the user to omit explicit "e0:value"s for
non-symbol literals; sexpression:variable-expression-expander expression-expands
a symbol into a variable expression; sexpression:expression-expression-expander
trivially expression-expands an expression into itself. Cutting away some comments:



(e1:define (sexpression:literal-expression-expander whatever)
  (e0:value* whatever))
(e1:define (sexpression:variable-expression-expander symbol)
  (e0:variable* symbol))
(e1:define (sexpression:expression-expression-expander expression)
  expression)



The cons expression expander is not complicated either, and resembles Syntactic 
Convention 5.7 (p. 78) in regarding the procedure call as a "default case". It can be 
already seen from this code how even syntactic forms are implemented as macros: 



(e1:define (sexpression:cons-expression-expander cons)
  (e0:let (car-sexpression) (cons:car cons)
    (e0:if-in (sexpression:symbol? car-sexpression) (#f)
      (e1:error (e0:value "cons:expression-expander: the car is not a symbol"))
      (e0:let (car-symbol) (sexpression:eject-symbol car-sexpression)
        (e0:if-in (state:macro? car-symbol) (#f)
          ;; The car is a symbol which is not a macro name:
          (e0:call* car-symbol (e1:macroexpand-sexpressions (cons:cdr cons)))
          (e1:macroexpand-macro-call car-symbol (cons:cdr cons)))))))



Lines 3-4 show how the specific cons-expander above is not suitable for higher-order 
personalities where the operator can be encoded by an s-expression different from 
an s-symbol. Of course the user is free to replace the cons-expander at a later time. 



† Common Lisp also supports "symbol macros" [4]: some symbols defined by the user are
macroexpanded like zero-parameter macro calls. Support for a similar feature can be added in
ε₁ by changing the symbol, rather than cons, expander.

The S-expressions section deals with the implementation of s-expressions as data
structures, and operations over them. Some of our procedures defined up to this
point already call procedures working over s-expressions such as
sexpression:symbol?, sexpression:eject and sexpression:inject-cons in procedure
bodies, although of course our procedures calling not-yet-defined procedures have
never been called themselves, yet.

The specific memory representation of an s-expression object has always been
considered important for efficiency in Lisp: all practical Lisps employ some
form of bitwise tagging of unboxed objects, boxed pointers and/or buffer words,
allowing elements of the most common addends to be stored in a compact way; such
representation techniques are often complex (see [21, §Data Representation] for Guile's
solution and [35] as a useful collection of "a body of folklore"), but such complexity
is motivated by the need for tagging all† data in Lisp.

However the situation of ε₁ is quite different from Lisp: s-expressions are mostly
used for representing user syntax before macroexpansion, but not necessarily as a
data structure after macroexpansion. Even if an efficient implementation is certainly
possible (potentially by machine generation as in [75]), for the time being we make do
with a quite literal implementation of Definition 5.1: we represent an s-expression as
a pointer to a two-element buffer, whose first cell holds the type tag while the second
holds the representation of the addend-type content. Some sample definitions:



(e1:define (sexpression:make tag value)
  (cons:make tag value))
(e1:define (sexpression:get-tag sexpression)
  (cons:get-car sexpression))
(e1:define (sexpression:eject sexpression)
  (cons:get-cdr sexpression))
(e1:define (sexpression:has-tag? x tag)
  (whatever:eq? (sexpression:get-tag x) tag))

;; We have generated unique tags when adding entries to the type table
(e1:define (sexpression:null? x)
  (sexpression:has-tag? x sexpression:empty-list-tag))
(e1:define (sexpression:boolean? x)
  (sexpression:has-tag? x sexpression:boolean-tag))
(e1:define (sexpression:cons? x)
  (sexpression:has-tag? x sexpression:cons-tag))

(e1:define (sexpression:inject-fixnum x)
  (sexpression:make sexpression:fixnum-tag x))
(e1:define (sexpression:eject-fixnum x)
  (e0:if-in (sexpression:fixnum? x) (#f)
    (e1:error (e0:value "sexpression:eject-fixnum: not a fixnum"))
    (sexpression:eject x)))

(e1:define sexpression:nil ;; an empty s-list object
  (sexpression:make sexpression:empty-list-tag empty-list:empty-list))
(e1:define (sexpression:car x)
  (e0:if-in (sexpression:cons? x) (#f)
    (e1:error (e0:value "sexpression:car: not a cons"))
    (cons:get-car (sexpression:eject x))))

(e1:define (sexpression:cadr x) (sexpression:car (sexpression:cdr x)))
(e1:define (sexpression:caadr x) (sexpression:car (sexpression:cadr x)))

† Advanced optimizing Lisp compilers such as SBCL [73] may actually avoid run-time tagging
in some code points, in cases when a type inference analysis succeeds and in favorable contexts.
This optimization, however, is not possible for the bulk of the code.

In our representation all s-expressions are boxed, and even traditionally "unique" 
objects such as the empty s-list or s-booleans may exist in more than one instance. 

We also define alternate versions of some procedures over fixnums and lists
suitable to work on s-fixnums and s-lists, as it will be convenient later in macros to
compute with s-expressions without explicit injections and ejections:



(e1:define (sexpression:1+ x)
  (sexpression:inject-fixnum (fixnum:1+ (sexpression:eject-fixnum x))))

(e1:define (sexpression:reverse x)
  (sexpression:append-reversed x sexpression:nil))
(e1:define (sexpression:append-reversed x y)
  (e0:if-in (sexpression:null? x) (#f)
    (sexpression:append-reversed (sexpression:cdr x)
                                 (sexpression:cons (sexpression:car x) y))
    y))



5.4.1.4 Macros 

Still following the bootstrap code in core.e, we are now finally ready to add support
for Macros.

The e1:macroexpand† procedure, turning an s-expression into a corresponding
expression, is but a trivial dispatcher tail-calling the appropriate
expression-expander; but some expanders, as the default one for cons does, may involve
expanding actual macro calls.



(e1:define (e1:macroexpand s)
  (e0:let (tag) (sexpression:get-tag s)
    (e0:let (content) (sexpression:eject s)
      (e0:call-indirect (sexpression:type-tag->expression-expander-procedure-name tag)
                        content))))
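
The dispatch performed by this procedure can be modelled in Python with a dictionary acting as the type table. In this sketch, which makes no claim to match the actual type table layout, an s-expression is a (tag, content) pair, expressions are tuples without handles, and the tag names and function names are hypothetical.

```python
def literal_expander(content):
    return ("value", content)       # a literal constant expression


def variable_expander(symbol):
    return ("variable", symbol)     # a variable expression


def expression_expander(expression):
    return expression               # an expression expands into itself


# Hypothetical type table: type tag -> expression-expander.
type_table = {
    "empty-list": literal_expander,
    "boolean": literal_expander,
    "fixnum": literal_expander,
    "symbol": variable_expander,
    "expression": expression_expander,
}


def macroexpand(sexpression):
    tag, content = sexpression
    # Indirect call through the table, in the role of e0:call-indirect.
    return type_table[tag](content)


literal = macroexpand(("fixnum", 57))
variable = macroexpand(("symbol", "x"))
```

Because the expander is looked up in the table at each call, replacing an entry changes how every later object of that type expands; for instance, assigning a different function to type_table["symbol"] would be the analogue of changing the symbol expander.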

The general idea of macros is simple enough‡: the user defines each macro "in
concrete syntax" as an s-expression, often relying on other macros. Before a macro
call can be expanded the macro body itself must have been macroexpanded in its
turn into an expression, which makes up the body of the associated macro procedure.
At macro call time, the macro procedure is called by supplying it with the macro
actuals; the macro procedure result, an s-expression, is then macroexpanded in
its turn, which may involve expanding other macro calls. If the process does not
diverge, the final result will be an expression. Since predefined macros allow all
ε₀ forms to be expressed, our macro system is trivially Turing-complete, already because
of macro procedures. Of course it is permitted, and useful, for a macro to return
another macro call: this allows building upon user-defined forms, "stacking" syntactic
abstractions one onto another.

† The name "macroexpand" may not be entirely appropriate, but has long been traditional in
Lisp circles. Even an s-expression containing no macro calls can be successfully "macroexpanded".

‡ Our mechanism is in practice not unlike the Common Lisp or Emacs Lisp macro systems,
despite our explicit distinction between expressions and s-expressions. Common Lisp uses an
auxiliary procedure "macroexpand-1" returning two results: the result of expanding one call, and

Macro definition and lookup are easy enough, based as they are is on symbol 
objects similarly to the global and procedure state environments: 



(el: define (state :macro-set ! macro-name macro-body-sexpression) 
(eO:let () 

;; If we're re-defining an existing macro, invalidate its previous procedure: 
(eO:if-in (buffer:get macro-name (eO:value 5)) (0) 
(eO: bundle) 

(state : invalidate-macro-procedure-name-cache-of ! macro -name ) ) 
(buffer: set! macro-name (eO: value 5) macro-body-sexpression))) 
(el: define (state :macro-get-body macro-name) 

(buffer: get macro-name (eO: value 5))) 
(el: define (state :macro? name) 

(state :macro-get-body name)) ;; iff unbound, which is to say return the body 



The careful reader may have noticed a small difference in state :macro-set ! com- 
pared to the analogous code for procedures: no formal parameters names are pro- 
vided for macros. This absence is a conscious choice of ours, leading to a small 
simplification: as no nonlocal is ever visible from a macro body, parameter shad- 
owing is impossible and we can safely use the same formal name "arguments" for 
all macros. Moreover we can use one formal for all the parameters of a macro 
call by viewing them as an s-expression, which is to say the s-cdr of the macro call 
s-expression — for example (m a (324) 3) has parameters (a (324) 3) . 

Of course we will also add support for friendlier macros with named formals, 
later as a syntactic extension. 

As a concession to efficiency, we cache macro procedures, by generating them at 
the time of the first expansion of a macro call, and then re-using them. It is im- 
portant to specify this point, because in rare cases caching may have a observable 
effect on the result — that is the case of macros performing side effects very early, 
while building the returned expression. 

The corresponding code is surprisingly simple, ignoring the references to trans- 
formations for the time being: 



a boolean saying whether expansion should continue. In our case we can simply use expression- 
expanders, which in the terminal case will receive an injected expression. 
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(el: define (state :macro-get-macro-procediire-name macro-name) 

(eO:let (cached-macro-procediire-name-or-zero) (buffer:get macro-name (eO:value 6)) 
(eO:if-in cached-macro-procednre-name-or-zero (0) 

(state :macro-get-macro-procediire-name-ignoring-caclie macro-name) 
cached-macro-procedure-name-or-zero) ) ) 

(el : define (state : macro-get-macro-procedure-name-ignoring- cache macro-name) 
(eO:let (body-as-sexpression) (state :macro-get-body macro-name) 
(eO:let (untransf ormed-name) (symbol : fresh) 

(eO:let (untransf ormed-formals) (list : singleton (eO: value arguments)) 
(eO:let (untransf ormed-body) (el :macroexpand body-as-sexpression) 
(eO:let () (state :procedure-set ! untransf ormed-name 

untransf ormed-formals 
untransf ormed-body) 
(eO:let (transf ormed-name transf ormed-formals transf ormed-body) 
(transform: transform-procedure untransf ormed-name 

untransf ormed-formals 
imtransf ormed-body) 
(eO:let () (state :procedure-set ! transf ormed-name 

transf ormed-formals 
transf ormed-body) 
(eO:let (buffer:set! macro-name (eO:value 6) transf ormed-name) 
transf ormed-name) )))))))) 



A macro call expansion consists in macroexpanding one call into an s-expression 
and then tail-calling to a further macroexpansion of the result, which will hopefully 
terminate; the usual terminal case is an injected expression. 



(e1:define (e1:macroexpand-1-macro-call symbol arguments)
  (e0:let (macro-procedure-name) (state:macro-get-macro-procedure-name symbol)
    (e0:call-indirect macro-procedure-name arguments)))

(e1:define (e1:macroexpand-macro-call symbol arguments)
  (e0:let (sexpression-after-one-expansion) (e1:macroexpand-1-macro-call symbol arguments)
    (e1:macroexpand sexpression-after-one-expansion)))

Just for completeness we also show the trivial helper, called by the cons expander, 
which macroexpands an s-list of s-expressions into a list of expressions, left-to-right: 



(e1:define (e1:macroexpand-sexpressions sexpressions)
  (e0:if-in (sexpression:null? sexpressions) (#f)
    (list:cons (e1:macroexpand (sexpression:car sexpressions))
               (e1:macroexpand-sexpressions (sexpression:cdr sexpressions)))
    list:nil))



We close by developing an illustrative and we hope not too artificial example. Let
us assume we have somehow added an e1:trivial-define-macro form for globally
defining a macro, internally using state:macro-set!; e1:trivial-define-macro
has two parameters: the macro name, and the macro body s-expression. We use
e1:trivial-define-macro to define our sample macro:

(e1:define (sexpression:list3 a b c)
  (sexpression:cons a (sexpression:cons b (sexpression:cons c sexpression:nil))))

(e1:trivial-define-macro silly-square
  ;; We would write `(fixnum:* ,(sexpression:car arguments) ,(sexpression:car arguments))
  ;; if we already had quasiquoting.
  (sexpression:list3 (sexpression:inject-symbol (e0:value fixnum:*))
                     (sexpression:car arguments)
                     (sexpression:car arguments)))



The silly-square macro takes at least one parameter (ignoring any one after the
first) and returns an expression multiplying the parameter by itself; the resulting
expression will contain two copies of the macroexpanded parameter, which therefore
will be evaluated twice.

For example, (silly-square 4 5 6) would eventually macroexpand to
$[\text{call fixnum:*}\ 4_{h_1}\ 4_{h_2}]_{h_3}$, for some fresh $h_1, h_2, h_3 \in H$.

When calling e1:macroexpand on (silly-square 4 5 6) we immediately go
through the cons expression-expander; assuming that silly-square is not a
procedure name, we tail-call e1:macroexpand-macro-call with two parameters: the
symbol silly-square, and the s-list (4 5 6); e1:macroexpand-macro-call attempts
to expand the first call by using e1:macroexpand-1-macro-call. Assuming
silly-square has not been used before, state:macro-get-macro-procedure-name
builds its macro procedure, which requires several expression-expansion calls not
involving macros; state:macro-get-macro-procedure-name then returns the macro
procedure name to e1:macroexpand-1-macro-call, which calls it on (4 5 6); the
result is the s-expression (fixnum:* 4 4), which is returned by
e1:macroexpand-1-macro-call; so control goes back to e1:macroexpand-macro-call, which
tail-calls e1:macroexpand on (fixnum:* 4 4); by trivial expression-expansions, we finally
obtain the expression $[\text{call fixnum:*}\ 4_{h_1}\ 4_{h_2}]_{h_3}$.
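
The expansion traced above can be replayed with a toy Python model. This is a sketch under simplifying assumptions, not the thesis code: s-expressions are modelled as Python ints, strings and lists, expressions as tuples, handles are omitted, and the names MACROS and macroexpand are hypothetical.

```python
MACROS = {}  # macro name -> macro procedure taking the argument s-list

def macroexpand(s):
    if isinstance(s, int):
        return ("constant", s)
    if isinstance(s, str):
        return ("variable", s)
    head, arguments = s[0], s[1:]
    if head in MACROS:
        # Expand one macro call, then macroexpand the resulting s-expression again.
        return macroexpand(MACROS[head](arguments))
    # Not a macro: a procedure call, with actuals expanded left to right.
    return ("call", head, [macroexpand(a) for a in arguments])

# silly-square ignores every argument after the first and duplicates the first one.
MACROS["silly-square"] = lambda arguments: ["fixnum:*", arguments[0], arguments[0]]
```

In this model macroexpand(["silly-square", 4, 5, 6]) first yields the s-expression ["fixnum:*", 4, 4] and then the call expression ("call", "fixnum:*", [("constant", 4), ("constant", 4)]), mirroring the two steps described in the text.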

5.4.1.5 Transforms 

Many mathematical presentations deal with "transformations", meant as code-to- 
code functions. Our transform strategy adopts the same approach, with the sole 
significant extension of also permitting side-effecting procedures. 

When building a transform, the user or personality developer has to simply define 
ordinary procedures working on code, and then to "hook" them to the system. There 
are two reasonable ways of running such transform procedures: 

• a procedure can be applied retroactively to the current state, adding (and 
usually replacing) definitions; 

• or it can be installed, to be applied automatically in the future to each toplevel 
expression or each procedure created from that point on; since the composi- 
tion order is usually significant, the user can control where each transform 
procedure fits in a global list of names.

In general we are interested in transforming three different entities: expressions,
procedure bindings, and global bindings. Transform procedures will need to be
different for each case, since the procedure interface cannot be compatible; but in our
experience, "companion" transform procedures tend to rely on some common helper
doing most of the actual work: for example, closure-converting a procedure binding
involves closure-converting its body, which is an expression (§5.4.4.4).

Global bindings are difficult to work with in practice, since they contain already- 
evaluated values with no fixed shape, rather than expressions; we did not yet find 
a use for global binding transform procedures, but we include such support for 
symmetry reasons. 

For generality's sake, we decided to have binding transform procedures also return
a transformed name: this may be either the same as the original untransformed
binding, or a new one. It might be convenient to keep the old definition around for
debugging reasons, for example, but change all the uses of the old entity to the new one,
by systematically renaming references.

A transform procedure may have one of three different interfaces:

• one parameter - one result, to transform an expression; 

• three parameters - three results, to transform a procedure binding: name, 
formals, body; 

• two parameters - two results, to transform a global binding: name, value. 

The ultimate purpose of our code-rewriting system is to let the user write in an 
expressive high-level language, to be then automatically reduced to ε₀ by "transforming
away" extensions. The transform procedures mapping "syntax into syntax"
therefore will need to support not only syntax, but extended syntax as data. We 
will show an elegant solution to this problem in §5.4.4.3, but we do not need to be 
concerned with it now while discussing the code which invokes procedure transforms. 

As a less obvious consequence of our design, side-effecting transforms provide for
another interesting opportunity: a simple way of performing a code analysis is to
implement a trivial transform procedure returning its parameters unchanged while
recording data in some global structure, possibly a global table with handles (§2.1.3) as
keys. Transforms actually returning modified code might also store their parameters
somewhere or simply record the relation between transformed and untransformed
code in some global structure, available for later debugging or analysis.
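
The idea of an analysis implemented as an identity transform can be sketched in Python. The sketch makes assumptions of its own: expressions are modelled as nested tuples such as ("call", name, [arguments]), the table is keyed by procedure name rather than by handle, and the names call_counts, count_calls and analysis_transform are hypothetical.

```python
from collections import Counter

call_counts = Counter()  # global table filled as a side effect

def count_calls(expression):
    # Walk the expression and count every call form by callee name.
    if expression[0] == "call":
        call_counts[expression[1]] += 1
        for argument in expression[2]:
            count_calls(argument)

def analysis_transform(expression):
    # One parameter, one result: the expression is returned unchanged.
    count_calls(expression)
    return expression
```

Because the result is the very same expression, such a transform can be installed anywhere in the transform list without affecting the generated code.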

Transforms also appear to be a convenient way to run some optimizations in which
expressions are rewritten into more efficient versions. As a first "low-hanging fruit"
we plan to use some heuristic search algorithm such as hill-climbing to search the
neighborhood of an expression for operationally-equivalent but faster versions.

Footnote: A practical problem of the current implementation which makes debugging
difficult is the lack of a reverse mapping from code to its original untransformed form
and ultimately to its s-expression concrete syntax and source location. Solving this
problem requires some care in writing transform, parsing and expression-expansion
procedures, so that when an expression is built from others, its origin is somehow
recorded in a graph. A linguistic extension to somehow automate this tracing process
might be appropriate.

We are finally ready to present some of the code in the Transforms section. At 
around 150 lines the code is quite short and also very uniform, often present in 
three just slightly different versions because of the three entities to manage. 

The global lists of transform names to be applied in order are simple boxed 
global variables: 



(e1:define transform:expression-transforms (box:make-initialized list:nil))
(e1:define transform:procedure-transforms (box:make-initialized list:nil))
(e1:define transform:global-transforms (box:make-initialized list:nil))

(e1:define (transform:prepend-expression-transform! new-transform-name)
  (box:set! transform:expression-transforms
            (list:cons new-transform-name (box:get transform:expression-transforms))))

(e1:define (transform:append-procedure-transform! new-transform-name)
  (e0:let () (box:set! transform:procedure-transforms
                       (list:append2 (box:get transform:procedure-transforms)
                                     (list:singleton new-transform-name)))
    (state:invalidate-macro-procedure-name-cache!))) ;; All macros have to be re-transformed

The interaction with macros is interesting as it reminds us that an untransformed 
procedure may be incompatible with its transformed version (for example, in a CPS 
transform the argument number may change): it is hence important to invalidate 
any cached macro procedure, so that new ones are created, and subjected to the 
current transforms. 

Applying transform procedures is trivial; this is the code which gets executed 
when a procedure binding is transformed; the other two cases are essentially identi- 
cal. 



(e1:define (transform:transform-procedure name formals body)
  (e0:let (transform-names) (box:get transform:procedure-transforms)
    (transform:apply-procedure-transforms transform-names name formals body)))
(e1:define (transform:apply-procedure-transforms remaining-transforms name formals body)
  (e0:if-in remaining-transforms (0)
    (e0:bundle name formals body)
    (e0:let (transformed-name transformed-formals transformed-body)
            (e0:call-indirect (list:head remaining-transforms) name formals body)
      (transform:apply-procedure-transforms (list:tail remaining-transforms)
                                            transformed-name
                                            transformed-formals
                                            transformed-body))))
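
The same threading of a (name, formals, body) triple through an ordered list of transforms can be written as a loop. The Python sketch below is only an analogy: transforms are modelled as plain functions instead of procedure names called indirectly, and the two sample transforms add_k and rename are hypothetical.

```python
def apply_procedure_transforms(transforms, name, formals, body):
    # Thread (name, formals, body) through each transform, first to last.
    for transform in transforms:
        name, formals, body = transform(name, formals, body)
    return name, formals, body

def add_k(name, formals, body):
    # Hypothetical transform changing the number of formals, as a CPS transform would.
    return name, formals + ["k"], ("call", "k", [body])

def rename(name, formals, body):
    # Hypothetical transform returning a new name for the binding.
    return name + "-cps", formals, body
```

With an empty transform list the triple is returned unchanged, which corresponds to the base case of the recursive definition.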

Retroactive transformation is more interesting. The user will call
transform:transform-retroactively! to install transform procedures for global and
procedure bindings, also specifying the names of some objects not to transform.



(e1:define (transform:transform-retroactively! globals-not-to-transform
                                               value-transform-names
                                               procedures-not-to-transform
                                               procedure-transform-names)
  (e0:let (global-names) (list:without-list (state:global-names) globals-not-to-transform)
    (e0:let (procedure-names)
            (list:without-list (state:procedure-names) procedures-not-to-transform)
      (e0:let (transformed-name-global-list)
              (transform:compute-transformed-globals global-names value-transform-names)
        (e0:let (transformed-name-formal-body-list)
                (transform:compute-transformed-procedures procedure-names
                                                          procedure-transform-names)
          (e0:primitive state:update-globals-and-procedures! transformed-name-global-list
                                                             transformed-name-formal-body-list))))))



The code works by first computing all transformed bindings (the trivial helpers
transform:compute-transformed-globals and transform:compute-transformed-procedures
simply return a list of transformed bindings) without performing any
global update; then, with a single primitive call, it activates all new bindings.
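
This compute-first, commit-once structure can be sketched in Python. The model is an assumption made for illustration: the procedure table is a dictionary from names to (formals, body) pairs, a single transform function is applied, and dict.update stands in for the one primitive call; the name transform_retroactively is hypothetical.

```python
def transform_retroactively(procedures, transform, excluded):
    # Phase 1: compute every transformed binding without touching the table.
    updated = {}
    for name, (formals, body) in procedures.items():
        if name in excluded:
            continue
        new_name, new_formals, new_body = transform(name, formals, body)
        updated[new_name] = (new_formals, new_body)
    # Phase 2: activate all the new bindings in a single step.
    procedures.update(updated)
```

While phase 1 runs, every procedure it calls is still in its old, mutually compatible form; no code observes a table in which only some bindings have been replaced.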

Why have such a complex primitive written in C? And why do we have to
compute all the bindings before applying any? The answer is that we need the state
environment update to be performed atomically, again because of the incompatibility
introduced by some transforms. The alternative of updating global definitions in
ε₀ would fail when at some point the updater procedure itself or its helpers would be
reached by the incompatible change wave, and break on a call from an untransformed
procedure to a transformed one, or vice-versa. For this reason only,
state:update-globals-and-procedures! must be a primitive.

Footnote: Atomicity here has nothing to do with concurrency. Our current code does not
even support synchronization primitives other than join, so background threads
performing imperative operations are not used at all.

The REPL section is the last interesting part of core.e. Its helper procedure
repl:macroexpand-transform-and-execute can be given an s-expression to
expression-expand, transform and evaluate:

(e1:define (repl:macroexpand-transform-and-execute sexpression)
  (e0:let (untransformed-expression) (e1:macroexpand sexpression)
    (e0:let (transformed-expression) (transform:transform-expression untransformed-expression)
      (e0:eval-ee transformed-expression))))

The REPL itself is very crude, and currently relies on a primitive io:read-sexpression
calling Scheme from C to read a Guile s-expression and then convert it into our
representation. This lack of a real frontend written in ε₁ is the last remaining reason
why we still depend on Guile after bootstrap (§5.4.5).

(e1:define (repl:repl)
  (e0:let () (string:write "Welcome to the epsilon REPL\n")
    (repl:loop (io:standard-input))))
(e1:define (repl:loop port)
  (e0:let () (string:write "e1>\n")
    (e0:let (next-sexpression) (e0:primitive io:read-sexpression port)
      (e0:let (results) (repl:macroexpand-transform-and-execute next-sexpression)
        (e0:let () (repl:write-results results port)
          (e0:let () (string:write "\n")
            (repl:loop port)))))))



5.4.1.6 An aside: developing, testing, and the ordering of phases 

In this presentation we have chosen to show the final structure of our bootstrapping 
system as a working body of code, rather than recounting the process of writing it; 
the two views do not perfectly overlap. 

The preceding phase is by far the most problematic in this respect: as any reader
with implementation experience may witness, ε₁ with its interpreter, global data
structures, macro and transform systems is a very strongly recursive system, where
each component tends to require all the others in a loop of circular dependencies
apparently very difficult to break. And indeed, the preceding third phase was not
easy to implement on the machine.

After deciding on the general bootstrapping strategy we wrote a first approximation
of the system, in ε₀ with no macros (see next phase) and no transforms,
up to the interpreter included. Some subsystems, for example the implementation
of sum-of-products types for ε₀ expressions, were first prototyped in Guile.
Transformations were added as the very last step, after macros worked reliably and were
used to make ε₁ considerably more friendly.

With the marshalling/unmarshalling support needed for unexec (§3.3.2), we followed
a route of progressively reducing the abstraction level: after writing its first
version in ε₁ using several comfortable language extensions, we translated it into
ε₀, to make it possible to run it earlier at boot, when extensions are not loaded
yet. The translated marshalling code is understandable, but some complexity which
would have been a little too daring for ε₀ still shows up in the code, particularly in
nested conditionals.

Later we rewrote the marshalling and unmarshalling support for a third time in 
C, for performance reasons (§5.4.3). 

At the beginning we wrote a considerable body of debugging code in Scheme, including
for example the procedure print-expression writing expressions in
ε₀'s syntax of §2 including handles in Unicode subscript digits, or hash-dump-sizes
which has served to test how well our hash functions distribute; and maybe most
importantly meta:print-procedure-definition and meta:print-macro-definition,
useful for inspecting the global state and obtaining readable syntax. Such code is still
available in bootstrap/scheme/conversion.scm, and still occasionally useful for
debugging:



guile> (meta:print-procedure-definition 'cons:make)
Formals: (car cdr)
[let [result] be [call buffer:make 2₇₇₉]₇₈₀ in [let [] be [call buffer:set! result₇₈₁ 0₇₈₂
car₇₈₃]₇₈₄ in [let [] be [call buffer:set! result₇₈₅ 1₇₈₆ cdr₇₈₇]₇₈₈ in result₇₈₉]₇₉₀]₇₉₁]₇₉₂

guile> (meta:print-macro-definition 'e0:call)
(sexpression:inject-expression (e0:call* (sexpression:eject-symbol (sexpression:car arguments))
                                         (e1:macroexpand-sexpressions (sexpression:cdr arguments))))

guile> (e0:value whatever:identity) ;; the symbol dump is painful to read
0x1471040[0x14c41e0[17 119 104 97 116 101 118 101 114 58 105 100 101 110 116 105 116 121]
127 0x1409f00[0x14633c0[0x145c900[1 120] 127 0] 0] 0x1462b80[0 10 0x14633c0]
0]



But as our design of ε₁ changed until its crystallization into the present form, some
of our crude debugging and code-generating tools also broke down and became 
unusable, whenever their underlying assumptions failed. As old scaffoldings not 
supporting any more a structure now able to stand by itself, we abandoned them. 

Our bootstrapping code running on top of an inefficient extension to Guile had
low performance, which was unsurprising. What we didn't expect was that waiting
times were in practice so unbearable even on our fastest machine that they
necessitated optimizations already in this phase. §5.4.3 provides some insights.

Footnote: Our fastest machine, optimum, is a Dell Precision T7400 with two quad-core
Intel Xeon (EM64T) chips at 3GHz, 8GB of RAM, heavily customized Debian GNU/Linux
"unstable".

5.4.1.7 Phase (iv): fill reflective data structures

Phase (iii) consisted of about 2500 lines in ε₀, which we have executed on top
of the ε₀ implementation of Phase (ii) based on guile+whatever; in other words,
our global definitions up to this point affected Guile's state environments, rather
than ours. This phase consists in using the Guile data structures we updated at
each definition to fill "reflective data structures"; in quotes, since we are actually
speaking of data to be stored as part of symbol objects.

The code is in bootstrap/scheme/fill-reflective-structures.scm.

Our Scheme implementation of e1:define from Phase (ii), at the end of
bootstrap/scheme/epsilon0-in-scheme.scm, updates two global Scheme data structures:
globals-to-define, a list of names of non-procedure globals which have been defined,
and procedures-to-define, an alist binding each defined procedure name
to its formals as a Guile list and its body as a Guile s-expression. The idea is to
scan the lists and for each element to copy the corresponding data into our state
environments.

• Non-procedures are easy to manage: given a global name as a Guile symbol
we simply have to look it up as a Guile global: the value we find, an injected
whatever, has to be copied into the appropriate element of the ε₁ symbol
corresponding to the Guile symbol.

• Procedures are more involved: for each one procedures-to-define contains
its name as a Guile symbol, its formals as a Guile symbol list, and its body
as a Guile s-expression. Name and formals are easy enough to translate, but
for our state environment we need the body as an ε₀ expression encoded in the
expression data structure we defined in Phase (iii); converting each body into
an expression is the main problem of this phase.

At this point we can better justify our rigidly constrained way of writing code
in Phase (iii), in which we only used ε₀ plus e1:define: the
s-expression-to-expression translation we need to perform at this point is non-macro expansion.
Since the translation has to be executed only once without the translation code
itself being part of the output, we can implement it in Scheme rather than in ε₀.
The procedure e0:non-macro-expand, defined in a mutually-recursive fashion with
its helpers e0:non-macro-expand-sexpressions, e0:non-macro-expand-symbols
and e0:non-macro-expand-values, follows very closely our Definition 5.8.

The code is slightly less readable than the corresponding mathematical definition
just because of explicit representation conversions between Guile's and ε₁'s data; for
example, the procedure whatever->guile-boolean converts an untyped ε₁ object
into a Guile dynamically-typed boolean, and guile-sexpression->sexpression
converts a native Guile s-expression into our own representation, as per the previous
phase. All such conversion operators, by themselves quite unremarkable, are
implemented in Scheme, in bootstrap/scheme/conversion.scm.

Our e0:non-macro-expand is also "unsafe" and in practice accepts a superset
of valid syntax encodings: we avoided safety checks in the code, for example ignoring
the s-cddr of (e0:join x . s) instead of verifying that it really is (). This
expansion unsafety is not a problem in practice at this point, since the code to be
translated has already been well tested on ε₀'s implementation of Phase (ii), using
Guile's interactive REPL (actually guile+whatever's), and just a little care in
reading untyped data structure dumps (§3.3.1).

The real work is in the Scheme procedure set-metadata!, a zero-parameter procedure
which consists of two loops, the first scanning the global binding list and
adding the definition to symbols, and the second doing the same for procedures
after s-expression conversion and non-macro expansion.

Even on our optimum machine, when using Guile 1.8, which is faster in this 
phase, the computation of Phase (iv) takes about 15 seconds, compared to 0.2 sec- 
onds for the previous phases combined; fortunately, unless there are recent changes 
in core.e, we can in practice entirely skip this phase by execing (§3.3) over the 
Phase (ii) interpreter. 

One problem remains: the globals and procedures we have defined up to this point
also remain in Guile's state environments, and this state of things will persist up
until we remove the dependency on Guile. Re-defining some procedure directly
invoked from Guile would lead to subtle problems, making the two definition sets
inconsistent. We will simply avoid overriding any ε₁ definition with an incompatible
one.

With the caveat above, and now having global and procedure definitions in place, 
we can finally use e1:eval.

5.4.2 Unexec 

At this stage it is finally possible to use unexec, which depended on reflective struc- 
tures to dump a program. Our vague reference in §3.3.1 to the "surprisingly few" 
data structures involved should be clear now: a simple way of obtaining a program 
is to dump a pair containing: 

• the symbol table, holding all global and procedure definitions and from which 
all alive data in memory can be reached (§3.3.1); 

• the main expression. 

At exec time, it suffices to unmarshal the pair, define the symbol table, and run the 
interpreter on the expression. 
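
The dump-a-pair idea can be mimicked in Python, using the standard pickle module in place of the marshalling code. The sketch is an analogy under stated assumptions: the symbol table is a plain dictionary, the interpreter is passed in as a function, and the names unexec and exec_image are hypothetical.

```python
import pickle

def unexec(symbol_table, main_expression):
    # Marshal a pair: all the definitions plus the expression to run at startup.
    return pickle.dumps((symbol_table, main_expression))

def exec_image(image, interpreter):
    # Unmarshal the pair, install the symbol table, then interpret the main expression.
    symbol_table, main_expression = pickle.loads(image)
    return interpreter(main_expression, symbol_table)
```

Since every live datum is reachable from the symbol table, serializing the pair is enough to capture the whole program state in this model.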

The ε₀ implementation of unexec:unexec and unexec:exec is in
bootstrap/scheme/unexec.e; the same file also contains the ε₀ implementation of marshalling and
unmarshalling.

5.4.3 Optimizations 

In a preliminary version of ε₁, macros were not associated to procedures to be called,
but to expressions to be evaluated. The current definition has a cleaner interaction 
with transforms, but if we ignore transforms the old solution was perfectly workable 
as well: instead of passing parameters to a procedure, we evaluated an expression 
in some environment, with the same effect. 

With the old solution, macroexpansion returned correct results, but the system's
incredible inefficiency led us to investigate the issue until we discovered a
perverse pattern: the complicated circular nature of the dependencies between
e1:macroexpand, expression-expanders, e0:eval and its helpers made it difficult to
understand how, indirectly, e0:eval was interpreting calls to itself.

It is easy to see how, if adding one layer of interpretation worsens performance
by some constant factor k, we have that a stack of n interpreters has exponential
overhead $k^n$; and given that symbolic interpreters easily cause order-of-magnitude
overheads, the slowdown was evident even for very small values of n.
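
To make the growth concrete, the arithmetic can be written as a one-line Python function. The factor k = 10 used in the example is an assumed order-of-magnitude figure for illustration, not a measurement from this work, and the name stacked_overhead is hypothetical.

```python
def stacked_overhead(k, n):
    # Each interpretation layer multiplies the running time by k.
    return k ** n
```

With the assumed k = 10, three stacked layers already give a slowdown of stacked_overhead(10, 3), that is 1000 times.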

For the first time in our programming experience we discovered that some code
had unbounded interpretation overhead. Although the fix is now unnecessary because
of the macroexpansion changes, we still find the problem and its solution quite
beautiful, and potentially instructive for others.

Implementation Note 5.11 (The Hack) When evaluating a call to e0:eval,
the self-interpreter does not evaluate e0:eval's body, but directly the given
expression in the given local environment. □

The idea is simply to recognize as a particular case any call of a procedure named
e0:eval:



(e1:define (e0:eval-call name actuals local)
  (e0:if-in (whatever:eq? name (e0:value e0:eval)) (#f)
    (e0:eval-non-eval-call name actuals local)
    (e0:eval-eval-call actuals local)))

(e1:define (e0:eval-eval-call actuals local)
  (e0:let (actual-values) (e0:eval-expressions actuals local)
    (e0:if-in (whatever:eq? (list:length actual-values) (e0:value 2)) (#f)
      (e1:error (e0:value "e0:eval-eval-call: in-dimension mismatch"))
      (e0:let (expression) (list:head actual-values)
        (e0:let (local) (list:head (list:tail actual-values))
          (list:singleton (e0:eval expression local))))))) ; wrap as inner eval would



Subjectively, it could be said that The Hack changed the interpreter from being 
comically slow to being still very slow, but at least usable. 

Despite being an obvious idea, the following implementation aspect deserves promi- 
nence because of its dramatic impact on performance: 

Implementation Note 5.12 (interpreter in C) We re-implemented an ε₀
interpreter in low-level C, using explicit stacks and no heap allocation, except implicitly
for building whatevers. The C implementation is some hundreds of lines long, and
performs runtime dimension checks. The interpreter is available from ε₁ as the
primitive e0:fast-eval, and has the same interface as e0:eval. □

Replacing the ε₀ self-interpreter with the C implementation led to a speedup of
around 200 times for an exponential-time recursive implementation of Fibonacci's
function:



(e1:define (fibo n)
  (e0:if-in n (0 1)
    n
    (fixnum:+ (fibo (fixnum:- n (e0:value 2)))
              (fibo (fixnum:1- n)))))



The high speedup is not surprising, if we consider that the ε₀ self-interpreter had
to run on top of Guile, itself an interpreter.

The "Interpreter in C" strategy subsumes The Hack: being written in a different
language than ε₀, the interpreter never accesses its own body, and interpreter calls
in the interpreted code are instead executed by a primitive, thus avoiding overhead
multiplication.

Implementation Note 5.13 (exec/unexec in C) We re-implemented the marshalling
and unmarshalling procedures unexec:dump and unexec:undump by using
primitives written in C. The C implementation consists of about 100 lines,
and adopts exactly the same data structures and algorithm as the corresponding
ε₀ code. □

Again, the optimization of Implementation Note 5.13 has an order-of-magnitude
impact on performance: thanks to it, exec "quick-start" takes only a short fraction
of a second on optimum.

Re-implementing part of the functionality in C was an aid to development and 
rapid testing, more than a definite commitment: after a good native compiler is 
developed, the need for such optimizations will attenuate. 

5.4.4 Sample extensions 

The file bootstrap/scheme/toplevel-in-scheme.scm, run right after
fill-reflective-structures.scm, defines a few simple Scheme macros to let the user
evaluate ε₁ forms within the Guile REPL: "(e1:toplevel . s)" evaluates each element
of the s-list s as an ε₁ expression, which of course is macroexpanded and transformed
before execution. "(e1:trivial-define-macro m s)", available both as a Scheme
macro and as an ε₁ macro, defines the macro named m (an s-symbol) as s (a generic
s-expression).

Armed with just this knowledge, the reader should be able to follow quite easily
bootstrap/scheme/epsilon1.scm, which contains around 2000 lines worth of ε₁
extensions.

We think that the power of syntactic abstraction is easy to appreciate now, by 
looking at how fast language expressivity improves after each definition, compared 
to the development work in phase (iii) during which only procedural abstraction
was available. 

This sequence of extensions, quite impressive in its accelerating rhythm, raises
the language from the clumsy beginnings of ε₀ to a respectable power, with
sequences, multi-way conditionals, short-circuit logical operators, Common Lisp-style
destructuring macros, variadic procedures, tuples, records, extensible sum-of-product
definitions, closures, imperative loops and futures.

Interestingly, only the last three extensions in the sequence above depend on a
transform; macros alone can already bring the language to a quite high level. 

Most of our syntactic conventions are inspired by Scheme, and the form names
are indeed largely compatible, apart from the "e1:" prefix.

The very beginning of bootstrap/scheme/epsilon1.scm deals with macros for core
ε₀ forms:

;;; These first crude versions do not perform error-checking, silently
;;; ignoring additional subforms at the end.
(e1:trivial-define-macro e0:variable
  (sexpression:inject-expression
    (e0:variable* (sexpression:eject-symbol (sexpression:car arguments)))))
(e1:trivial-define-macro e0:let
  (sexpression:inject-expression
    (e0:let* (sexpression:eject-symbols (sexpression:car arguments))
             (e1:macroexpand (sexpression:cadr arguments))
             (e1:macroexpand (sexpression:caddr arguments)))))



The code is simple, but it is questionable whether it really belongs in this file,
rather than in core.e. The reason why we defined such important features so late is
mostly pragmatic: such macro definitions would have been much less comfortable
to write using only state:macro-set! without e1:trivial-define-macro. For
similar reasons we defined quoting and quasiquoting in this file, rather than in
core.e.

The debate about where exactly the ε₁ "core" ends and "extensions" begin looks
futile anyway, and indeed the very notion of personality, possibly useful for humans
to identify a set of features, has no consequence for the implementation. The same
objection may be raised "at the other end", about the CPS transformation and the
reason why we defined it in its own source file instead of at the end of epsilon1.scm.
In the same somewhat arbitrary fashion, we proclaim that continuations do not
belong in ε₁ but are part of another experimental personality based on ε₁. First-class
continuations provide another qualitative jump in expressivity, but our
implementation is less mature and quite expensive in terms of performance, therefore less
appropriate as part of the general-purpose "library" to build personalities which ε₁
is meant to be.

In the following we will just add some quick considerations about the main ex- 
tensions. 

5.4.4.1 Quoting and quasiquoting 

Quoting and quasiquoting, heavily relying on the type table (§5.4.1.3) so that support
for new types can be added smoothly, are different from their Lisp homologues:
in ε₁ a quoted or quasiquoted s-expression yields an expression which will build that
s-expression when evaluated; for example '1 macroexpands to a procedure call of
sexpression:inject-fixnum which, if evaluated, will build the s-expression (not
the unboxed fixnum) 1.
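
The distinction between a quoted datum and the expression which builds it can be modeled outside ε₁. The following Python sketch is only an illustration under our own assumptions: the names SExpr, inject_fixnum, inject_cons, quote and evaluate are hypothetical, expressions are modeled as nested tuples, and the dispatch on Python types stands in for the lookup in the type table.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SExpr:
    """A tagged s-expression: a type tag plus an untagged payload."""
    tag: str
    payload: object

def inject_fixnum(n):
    """Hypothetical stand-in for sexpression:inject-fixnum."""
    return SExpr("fixnum", n)

def inject_cons(car, cdr):
    """Hypothetical injection for pairs of s-expressions."""
    return SExpr("cons", (car, cdr))

EMPTY = SExpr("empty-list", None)

# Expressions are modeled as ("value", v) or ("call", procedure, arguments).
def quote(datum):
    """Return an *expression* which rebuilds datum as an s-expression."""
    if isinstance(datum, int):
        return ("call", inject_fixnum, [("value", datum)])
    if isinstance(datum, list):
        if not datum:
            return ("value", EMPTY)
        return ("call", inject_cons, [quote(datum[0]), quote(datum[1:])])
    raise TypeError("datum type not supported by this sketch")

def evaluate(expression):
    """Evaluate the tiny expression language used by quote."""
    if expression[0] == "value":
        return expression[1]
    _, procedure, arguments = expression
    return procedure(*[evaluate(a) for a in arguments])

quoted = quote(1)         # an expression, not yet an s-expression
built = evaluate(quoted)  # the s-expression 1, distinct from the unboxed fixnum 1
```

Only the evaluation step produces the s-expression; the macroexpansion step merely produces code, which is the property the paragraph above describes.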

Despite this difference, we can easily adapt the standard algorithms for quasiquot- 
ing^^, which is convenient since nested quasiquoting is famously tricky to implement 
correctly. 

^^We followed Bawden's updated proposal (different from his older one in [7, §B]), as quoted by
Kent Dybvig at http://www.r6rs.org/r6rs-editors/2006-June/001376.html. This new version
was eventually adopted in [79].

The non-homoiconicity of ε₁ forces us to think of the difference between s-expressions,
uninjected values and expressions, and costs us some injection and ejection
operators in macros. The inconvenience in practice is tolerable, and we consider the
advantages of our syntactic extensions well worth this minor trouble.

5.4.4.2 Variadic procedure wrappers 

All practical Lisps permit defining variadic procedures, which is to say procedures
with an arbitrary number of optional arguments. ε₀ and ε₁ do not, for reasons of
efficiency. Anyway we can recover the convenience of variadic calls by introducing
variadic macros, and using them to wrap procedures.

The following ε₁ definitions extend binary operators with a neutral element
so that they work with any number of arguments:



(variadic:define-associative fixnum:+ fixnum:+ 0)
(variadic:define-right-deep fixnum:** fixnum:** 1)

The macro-call expansions of variadic:define-associative and
variadic:define-right-deep generate more macro definitions, in this case for "fixnum:+" and
"fixnum:**", which for added convenience are also the names of the corresponding
procedures.

After the definition, using the debugging procedure meta:macroexpand we can 
examine how variadic calls are always "eliminated" at macroexpansion time, yielding 
efficient residual code: 



guile> (meta:macroexpand '(fixnum:+)) ;; no arguments: neutral element as a literal
0₇₂₆₃₁
guile> (meta:macroexpand '(fixnum:+ 7)) ;; one argument: no calls are needed
7₇₂₅₃₂
guile> (meta:macroexpand '(fixnum:+ 1 2)) ;; one sum
[call fixnum:+ 1₇₂₅₃₃ 2₇₂₅₃₄]₇₂₅₃₅
guile> (meta:macroexpand '(fixnum:+ 1 2 3 4)) ;; three sums, left-deep (currently faster)
[call fixnum:+ [call fixnum:+ [call fixnum:+ 1₇₂₅₃₆ 2₇₂₅₃₇]₇₂₅₃₈ 3₇₂₅₃₉]₇₂₅₄₀ 4₇₂₅₄₁]₇₂₅₄₂
guile> (meta:macroexpand '(fixnum:** 2 3 4 5)) ;; three calls, right-deep as requested
[call fixnum:** 2₇₂₆₀₅ [call fixnum:** 3₇₂₆₀₆ [call fixnum:** 4₇₂₆₀₇ 5₇₂₆₀₈]₇₂₆₀₉]₇₂₆₁₀]₇₂₆₁₁
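
The shape of these expansions can be reproduced with two small folds. The Python sketch below is a model rather than the ε₁ macros themselves: expand_associative and expand_right_deep are hypothetical names, calls are modeled as nested lists, and handles are omitted.

```python
def expand_associative(name, neutral, arguments):
    """Left-deep expansion, in the style of variadic:define-associative."""
    if not arguments:
        return neutral                      # no arguments: the neutral element
    result = arguments[0]                   # one argument: no call is needed
    for argument in arguments[1:]:
        result = ["call", name, result, argument]
    return result

def expand_right_deep(name, neutral, arguments):
    """Right-deep expansion, in the style of variadic:define-right-deep."""
    if not arguments:
        return neutral
    if len(arguments) == 1:
        return arguments[0]
    return ["call", name, arguments[0],
            expand_right_deep(name, neutral, arguments[1:])]

left = expand_associative("fixnum:+", 0, [1, 2, 3, 4])
right = expand_right_deep("fixnum:**", 1, [2, 3, 4, 5])
```

Both folds run entirely at expansion time, so the residual code contains only ordinary binary calls.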

Since variadic syntax is so convenient, we also use it for many other macros which
are not procedure wrappers:



guile> (meta:macroexpand '(e1:or))
0₇₂₄₆₄
guile> (meta:macroexpand '(e1:or a))
a₇₂₄₆₅
guile> (meta:macroexpand '(e1:or a b c))
[if a₇₂₄₆₆ ∈ {0} then [if b₇₂₄₆₇ ∈ {0} then c₇₂₄₆₈ else 1₇₂₄₆₉]₇₂₄₇₀ else 1₇₂₄₇₁]₇₂₄₇₂
guile> (meta:macroexpand '(e1:and a b c))
[if a₇₂₄₇₃ ∈ {0} then 0₇₂₄₇₄ else [if b₇₂₄₇₅ ∈ {0} then 0₇₂₄₇₆ else c₇₂₄₇₇]₇₂₄₇₈]₇₂₄₇₉

5.4.4.3 Sum-of-product types

Sum-of-product or sum types are a kind of variant record, introduced by ML and
popular in the functional programming community.

Even in ε₁'s untyped context, it is very convenient to automatically turn a sum
description into procedures for building, accessing and updating data, and for
testing the case of a given object.

As a classic example, let a list be either nil, or the cons of a head and a tail: 



e1> (sum:define my-list (nil) (cons head tail))
Defining the procedure my-list-nil...
Defining the procedure my-list-nil?...
Defining the procedure my-list-nil-explode...
Defining the procedure my-list-cons...
Defining the procedure my-list-cons-make-uninitialized...
Defining the procedure my-list-cons-explode...
Defining the procedure my-list-cons-get-head...
Defining the procedure my-list-cons-with-head...
Defining the procedure my-list-cons-set-head!...
Defining the procedure my-list-cons-get-tail...
Defining the procedure my-list-cons-with-tail...
Defining the procedure my-list-cons-set-tail!...
Defining the procedure my-list-cons?...



Our sum type definitions take into account the number of cases which must be
represented as boxed, and do not generate tag fields unless needed. We derive the 
representation in memory from the sum definition in a way similar to [5, §4.1]. 



e1> (my-list-nil) ;; unboxed
0
e1> (my-list-cons 10 (my-list-nil)) ;; just head and tail, no case tag
0x1b984d0[10 0]
e1> (my-list-cons 10 (my-list-nil)) ;; make a *new* cons: different address
0x1be3b70[10 0]
e1> (my-list-cons 10 20) ;; "ill-typed" as a list: the system doesn't care
0x1998350[10 20]
e1> (my-list-cons? (my-list-nil)) ;; is nil a cons? (No, it's not)

e1> (sum:define complex (cartesian real imaginary)
                        (polar angle radius)) ;; two boxed cases: case tag needed
;;; [...]
e1> (complex-cartesian 100 200) ;; first case: tag 0
0x152c0c0[0 100 200]
e1> (complex-polar 100 200) ;; second case: tag 1
0x1502620[1 100 200]
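
The choice of representation seen in this session can be modeled in a few lines. The Python sketch below is a guess at the rule, consistent with the outputs above but not taken from the actual sum:define code: plan_representation and build are hypothetical names, Python lists stand in for heap buffers, and we assume that field-less cases are numbered from 0 as unboxed values while tags for boxed cases are numbered separately.

```python
def plan_representation(cases):
    """Given [(case_name, field_names), ...], choose a layout for each case.

    Assumptions of this sketch: a case without fields becomes an unboxed
    small integer; a case with fields becomes a buffer, with a leading tag
    only when more than one case needs to be boxed.
    """
    boxed_count = sum(1 for _, fields in cases if fields)
    layout, next_unboxed, next_tag = {}, 0, 0
    for name, fields in cases:
        if not fields:
            layout[name] = ("unboxed", next_unboxed, ())
            next_unboxed += 1
        elif boxed_count > 1:
            layout[name] = ("boxed", next_tag, tuple(fields))
            next_tag += 1
        else:
            layout[name] = ("boxed", None, tuple(fields))
    return layout

def build(layout, case, *values):
    """Build an object of the given case according to the planned layout."""
    kind, tag, fields = layout[case]
    if kind == "unboxed":
        return tag
    if len(values) != len(fields):
        raise ValueError("wrong number of fields")
    return ([tag] if tag is not None else []) + list(values)

my_list = plan_representation([("nil", []), ("cons", ["head", "tail"])])
complex_ = plan_representation([("cartesian", ["real", "imaginary"]),
                                ("polar", ["angle", "radius"])])
```

With these definitions build(my_list, "cons", 10, build(my_list, "nil")) yields a two-element buffer, while both complex cases carry a leading tag.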



We can now redefine ε₀ expressions as an open sum-of-products, openness meaning
that more cases can be added later. This permits more flexibility, at the cost of a
slightly less efficient representation in the general case:

(sum:define-open e0:expression
  (variable handle name)
  (value handle content)
  (bundle handle items)
  (primitive handle name actuals)
  (let handle bound-variables bound-expression body)
  (call handle procedure-name actuals)
  (call-indirect handle procedure-expression actuals)
  (if-in handle discriminand values then-branch else-branch)
  (fork handle procedure-name actuals)
  (join handle future))



The representation is compatible with the one used in core.e, and from now on it 
will also be possible to add new expression cases, for user-defined expression forms. 

5.4.4.4 Closure Conversion 

The purpose of this extension is adding statically-scoped, higher-order anonymous
procedures to ε₁, implemented as closures.

Anonymous procedures require two^^ new syntax cases, lambda and call-closure:



(sum:extend-open e0:expression
  (lambda handle formals body)
  (call-closure handle closure-expression actuals))
(e1:define (e1:lambda* formals body) ;; make a lambda expression
  (e0:expression-lambda (e0:fresh-handle) formals body))
;;; "Concrete syntax" for lambda, generating the new expression case. This is a
;;; variadic macro of one or more arguments: body-forms is bound to the argument s-cdr
(e1:define-macro (e1:lambda formals . body-forms)
  (sexpression:inject-expression
    (e1:lambda* (sexpression:eject-symbols formals)
                (e1:macroexpand `(e1:begin ,@body-forms)))))



Our closures are flat and minimal [24, p. 132], consisting of a single buffer holding a 
procedure name as its first element, followed by its zero or more nonlocal values; for 
us, nonlocals are the variables locally-bound out of the lambda-expression, occurring 
free in the lambda body, hence in particular not shadowed by the lambda formals. The 
procedure referred to by the closure takes the closure itself as a parameter, followed
by the ones explicitly mentioned as formals, and locally binds the nonlocal names 
by loading their values off the closure. 

For example (e0:let (a) 57 (e1:lambda (x) a)) will yield a closure of two
elements: a procedure name (automatically generated, two parameters: the closure
and x), and the only nonlocal value 57. The procedure body will contain an e0:let
block binding the name a to the second element of the buffer pointed to by its closure
parameter.

^^Call-closure does not technically need to be added as a new syntactic case, as it would also
be definable as a macro; only the lambda case has an expansion which depends on its context.
However having both cases representable as expressions is useful for the CPS transform, and may
be a good idea in case we want to add type analyses in the future.

Calling a closure is easy: given a closure and its actuals, we load the first element 
referred by the closure and we perform an indirect call to it, passing as parameters 
the closure itself followed by the actuals. 
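
As a minimal model of this calling convention, the Python sketch below represents a flat closure as a list whose first element is the procedure, followed by the nonlocal values. The names make_closure, call_closure and generated_procedure are hypothetical, and a Python function object stands in for the procedure name stored in the real buffer.

```python
def make_closure(procedure, *nonlocal_values):
    """A flat closure: one buffer, procedure first, then the nonlocal values."""
    return [procedure, *nonlocal_values]

def call_closure(closure, *actuals):
    """Load element 0 and call it indirectly, passing the closure itself first."""
    return closure[0](closure, *actuals)

# What the procedure generated for (e1:lambda (x) a) could look like: its first
# parameter is the closure, from which the nonlocal a is loaded (element 1).
def generated_procedure(closure, x):
    a = closure[1]
    return a

closure = make_closure(generated_procedure, 57)
result = call_closure(closure, 123)
```

The caller never needs to know how many nonlocals the closure holds, since only the generated procedure indexes past the first element.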

closure:closure-convert implements closure-conversion; it needs an expression and
the set of locally-bound variables. The procedure is very simple, based on a
multi-way conditional e1:cond which dispatches on the expression case^^. e1:let*, not
to be confused with e0:let*^^, is a block binding sequentially:



(e1:define (closure:closure-convert e bounds)
  (e1:cond ((e0:expression-variable? e)
            (e0:variable* (e0:expression-variable-get-name e)))
           ((e0:expression-bundle? e)
            (e0:bundle* (closure:closure-convert-expressions
                          (e0:expression-bundle-get-items e) bounds)))
           ;; [...] other trivial cases [...]
           ((e0:expression-lambda? e) ;; Interesting case
            (e1:let* ((formals (e0:expression-lambda-get-formals e))
                      (nonlocals (set-as-list:subtraction bounds formals))
                      (old-body (e0:expression-lambda-get-body e))
                      (new-body (closure:closure-convert old-body
                                  (set-as-list:union bounds formals)))
                      (used-nonlocals (set-as-list:intersection nonlocals
                                        (e0:free-variables new-body))))
              ;; closure:make* defines a global procedure, then returns an expression
              ;; which builds a closure data structure including the global procedure name
              (closure:make* used-nonlocals
                             (closure:variables* used-nonlocals)
                             formals
                             new-body)))
           ((e0:expression-call-closure? e) ;; The second interesting case
            (e1:let* ((closure-expression (e0:expression-call-closure-get-closure-expression e))
                      (actuals (e0:expression-call-closure-get-actuals e))
                      (transformed-closure-name (symbol:fresh)))
              (e0:let* (list:singleton transformed-closure-name)
                       (closure:closure-convert closure-expression bounds)
                       (e0:call-indirect*
                         (e0:primitive* (e0:value buffer:get)
                                        (list:list (e0:variable* transformed-closure-name)
                                                   (e0:value* 0)))
                         (list:cons (e0:variable* transformed-closure-name)
                                    (closure:closure-convert-expressions actuals bounds))))))
           (else
            (e1:error "unknown extended or invalid expression"))))
(e1:define (closure:closure-convert-expressions es bounds)
  (e1:if (list:null? es)
    list:nil
    (list:cons (closure:closure-convert (list:head es) bounds)
               (closure:closure-convert-expressions (list:tail es) bounds))))

^^Pattern-matching over sum types can be implemented with macros, and in fact we did that
in a previous prototype: see §5.4.5.

^^The naming convention is unfortunate in this case, but the sequential-binding name "let*",
as distinct from parallel-binding "let", is convenient and has been conventional in Lisp for decades.
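
To make the two interesting cases concrete, here is a small executable model of the same algorithm in Python. It is a sketch under simplifying assumptions and not a translation of the ε₁ code: expressions are tuples, "let" binds a single variable instead of a bundle, "+" is the only arithmetic primitive, and the names convert, free_variables, evaluate, procedures and "_closure" are all hypothetical.

```python
import itertools

fresh_names = (f"_g{i}" for i in itertools.count())
procedures = {}  # global procedure name -> (formals, body)

def free_variables(e):
    """Free variables of an already converted expression."""
    tag = e[0]
    if tag == "variable":
        return {e[1]}
    if tag == "value":
        return set()
    if tag == "let":                         # ("let", name, bound, body)
        return free_variables(e[2]) | (free_variables(e[3]) - {e[1]})
    if tag == "make-closure":                # ("make-closure", procedure, nonlocals)
        return set(e[2])
    # ("primitive", name, args) or ("call-indirect", target, args)
    subexpressions = e[2] if tag == "primitive" else [e[1]] + e[2]
    return set().union(*map(free_variables, subexpressions))

def convert(e, bounds):
    tag = e[0]
    if tag in ("variable", "value"):
        return e
    if tag == "primitive":
        return ("primitive", e[1], [convert(a, bounds) for a in e[2]])
    if tag == "let":
        _, name, bound, body = e
        return ("let", name, convert(bound, bounds), convert(body, bounds | {name}))
    if tag == "lambda":                      # the interesting case
        _, formals, body = e
        nonlocals = bounds - set(formals)
        new_body = convert(body, bounds | set(formals))
        used = sorted(nonlocals & free_variables(new_body))
        # Reload each used nonlocal from the closure, which is the first parameter.
        for index, name in enumerate(used, start=1):
            load = ("primitive", "buffer:get", [("variable", "_closure"), ("value", index)])
            new_body = ("let", name, load, new_body)
        procedure = next(fresh_names)
        procedures[procedure] = (["_closure"] + list(formals), new_body)
        return ("make-closure", procedure, used)
    if tag == "call-closure":                # the second interesting case
        _, closure_expression, actuals = e
        name = next(fresh_names)
        target = ("primitive", "buffer:get", [("variable", name), ("value", 0)])
        return ("let", name, convert(closure_expression, bounds),
                ("call-indirect", target,
                 [("variable", name)] + [convert(a, bounds) for a in actuals]))
    raise ValueError("unknown extended or invalid expression")

def evaluate(e, env):
    """Evaluate converted code only: no lambda or call-closure may remain."""
    tag = e[0]
    if tag == "variable":
        return env[e[1]]
    if tag == "value":
        return e[1]
    if tag == "let":
        return evaluate(e[3], {**env, e[1]: evaluate(e[2], env)})
    if tag == "make-closure":
        return [e[1]] + [env[name] for name in e[2]]
    if tag == "primitive":
        arguments = [evaluate(a, env) for a in e[2]]
        if e[1] == "buffer:get":
            return arguments[0][arguments[1]]
        return sum(arguments)                # the only other primitive here: "+"
    formals, body = procedures[evaluate(e[1], env)]   # "call-indirect"
    return evaluate(body, dict(zip(formals, [evaluate(a, env) for a in e[2]])))

program = ("let", "a", ("value", 1),
           ("let", "b", ("value", 2),
            ("let", "c", ("value", 3),
             ("call-closure",
              ("lambda", ["x"], ("primitive", "+", [("variable", n) for n in "abcx"])),
              [("value", 4)]))))
converted = convert(program, set())
result = evaluate(converted, {})
```

As in the ε₁ version, free variables are computed on the converted body, so that only the nonlocals actually used end up in the closure.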



closure:closure-convert is the basis of our procedure transforms:



(e1:define (closure:closure-convert-expression-transform expression)
  (closure:closure-convert expression set-as-list:empty))
(e1:define (closure:closure-convert-procedure-transform name formals body)
  (e0:bundle name
             formals
             (closure:closure-convert body formals)))

(transform:prepend-expression-transform! (e0:value closure:closure-convert-expression-transform))
(transform:prepend-procedure-transform! (e0:value closure:closure-convert-procedure-transform))



Now that we have installed the transform procedures, we can use closures: 



e1> (e1:define q (e1:let* ((a 1) (b 2) (c 3))
                   (e1:lambda (x)
                     (fixnum:+ a b c x))))
e1> (e1:call-closure q 4)
10



It should be remarked that closures are distinct from and incompatible with ε₀
procedures. Should we hide ordinary procedures from the user, and use closures 
only? 

We could: it is possible to introduce (trivial) closures for all existing procedures, 
retroactively transform away all procedure calls into closure calls (and then into 
indirect calls by closure-conversion) and finally change the cons-expander to generate 
a closure call rather than a procedure call as its default case. This would make ε₁
similar to a "Lisp-1" [32] by hiding from the user the existence of procedures which 
are independent from closures. 

Such a move would be perfectly reasonable in many high-level personalities, but 
we reject it for ε₁, for which we want to retain low-level control.

5.4.4.5 Futures 

Our fork form in ε₀ is very inconvenient to use, needing a procedure which must
be given parameters to evaluate in foreground, rather than just an expression (§2.2,
p. 22). But closures make it easy to define friendlier futures, by a simple macro:



(e1:define (future:fork-procedure thread-name future-closure)
  (e1:call-closure future-closure))
;;; Build a future which will asynchronously call the given closure:
(e1:define (future:asynchronously-call-closure closure)
  (e0:fork future:fork-procedure closure))

(e1:define-macro (e1:future . forms) ;; friendly syntax: any number of forms in sequence
  `(future:asynchronously-call-closure (e1:lambda () ,@forms)))



5.4.4.6 First-class continuations 

We implemented first-class continuations with a CPS transform [83, 5, 44, 46] on 
expressions extended with a let/cc form ("CATCH" in [89]), with call/cc defined 
as a macro over let/cc. 

Our CPS transform is more tentative than the ε₁ personality, and currently
resides in bootstrap/scheme/cps.scm, with the trivial driver in
bootstrap/scheme/cps-repl.scm. Implemented in a very conventional style, it works but yields
inefficient code and is inefficient at transformation time as well: in particular the
high number of local variables generated by CPS stresses closure-conversion and its
algorithm to compute the free variables of an expression, currently quadratic.

The generated code allocates closures at a very high rate; it can be optimized 
and some improvements appear easy, but to obtain really efficient code we would 
need escape analysis, so that code sure not to escape could be recognized and trans- 
formed differently. Such global (or just "incremental") analyses can be performed in 
our model, by having a CPS transform return its provisional inefficient result but 
save the original untransformed code, to be reconsidered later. 

Bundles have been problematic, since CPS maps our e0:let form, which ignores
excess items (§2.1.5, p. 20), into a procedure call, which does not ignore excess
parameters; in order to respect our e0:let semantics we had to relax some dynamic
checks in the ε₀ interpreter, and rely on a behavior which constitutes an error
according to the semantics. It is not clear whether it would be best to update the
semantics to ignore extra parameters (hence defining a non-error behavior in more
cases, which constrains implementations^^), or to forbid bundles altogether in
conjunction with CPS.

Continuations have been very useful to test and stress our transform system, since
a CPS transform is much less "well-behaved" than a closure-conversion transform:
CPS adds one more argument to every procedure, making transformed code
fundamentally incompatible with its original version. When closures are not used,
closure-conversion returns unchanged code, up to handles; but a (naive) CPS
transform fundamentally changes the expression structure even where no jump is
performed. Of course the CPS transform needs to be applied retroactively (§5.4.1.5,
p. 106).

^^We only touched the C version, by trivially removing two conditionals, one for e0:call and the
other for e0:call-indirect. The same change can be easily replicated in the ε₀ self-interpreter. In
such symbolic interpreters removing the check is trivial and actually slightly improves performance:
this will not be true in a compiled implementation.

We are not sure whether traditional "full" continuations are pragmatically the best 
foundation to base further extensions on; delimited continuations [33, 68, 27] seem to 
have advantages, and we have experimented with them in early prototypes; thanks 
to our open-ended design we may adopt them in the future. 

5.4.5 Implementation status 

The implementation is not mature, but it can be played with. We currently depend 
on Guile to parse and print s-expressions, and our current implementation still lacks 
a compiler. 

Such limitations are temporary and incidental: in the time available we chose 
to develop transforms, more innovative and interesting, rather than implementing 
well-known algorithms once again. We do not envisage any particular difficulty, and 
development will proceed during the following months. 

The older prototypes, unmaintained but available at
http://ageinghacker.net/epsilon-thesis-prototypes/, contain some code which could possibly be
worth adapting and integrating into the current implementation:

• an s-expression frontend written in OCaml for an older prototype, supporting 
the grammar of Figure §5.1; it works and contains a very powerful scanner 
supporting a variant of Thompson [91] and Rabin-Scott Constructions [69] 
over large character-sets; 

• an incomplete compiler including liveness analysis and RTL generation; 

• pattern-matching macros working on a different implementation of sum-of- 
product types; 

• a mostly complete CamlP4 printer, intended to automatically translate the 
OCaml code into maintainable ε₁ code.

An official part of the GNU project, epsilon is free software, released under the
GNU GPL version 3 or later [31]. Its home page is
http://www.gnu.org/software/epsilon.

We manage the source code on a public bzr server at
bzr://bzr.savannah.gnu.org/epsilon/trunk, and a public mailing list is available for discussion. See
https://savannah.gnu.org/projects/epsilon for more information.

5.5 Future work 

Building a large body of extensions raises the issue of controlling their interaction. 
Transform-based extensions in particular, relying as they do on the enumeration 
of all expression forms, require knowledge of all the previously-added expression
forms. No solution to this problem is apparent. However, without promising a "silver
bullet" to language extension, we still maintain our approach of layered syntactic 
forms to be much more suitable to extensibility than the traditional solution of a 
large unstructured collection of language forms. 

As an orthogonal problem, our current implementation does not keep
a map from expressions to original source locations (§1.3.4), which may complicate 
debugging. Ad-hoc solutions involving an s-expression frontend keeping track of 
source locations, then to be threaded through macros and transforms up to the final 
generated code, seem perfectly feasible, with handles coming in handy; on the other 
hand it is desirable to keep extension definitions as uncluttered as possible, ideally 
by leaving the "current" location information always implicit at each stage, in a 
monadic fashion. A clean solution to this problem seems well worth investigating. 

5.6 Summary 

Lisp is a powerful language, and its homoiconic syntax based on s-expressions makes 
it easy to extend with macros. 

We adopted a form of Lisp-style s-expressions as a data structure to represent 
user syntax, but we keep it distinct from expressions: our macros map s-expressions 
into expression objects; then, going beyond Lisp, expression objects can be
manipulated by user-specified transform procedures, until all syntactic extensions are
"transformed away" and only ε₀ forms remain.

We have shown in detail how ε₁, a low-level ε personality useful as a basis to
build other extensions, is bootstrapped from ε₀ temporarily leaning on Guile. Our
bootstrapping code also constitutes a complete definition of the macro and transform
systems.

We closed by showing some interesting language extensions in ε₁, as
representative examples of our syntactic abstraction facilities.

A parallel BiBOP garbage collector



When a high-level program requiring garbage collection runs in parallel on a multi- 
core machine, the memory subsystem easily becomes the bottleneck. For this reason 
we implemented a parallel collector for ε, actually starting back when the current
incarnation of the language was still taking shape, testing it on a toy Lisp imple- 
mentation we originally wrote as a teaching aid. 

The collector's performance profile is meant to best match a mostly-functional 
personality. It is relatively easy to interface to C systems, and by design is not 
limited to ε.

We call our system "epsilongc". As for the language name, the initial "e" is

Our parallel collector is non-moving, based on a variant of the BiBOP
organization. Building on the experience of Boehm's work and revisiting some older ideas
in the light of current hardware performance trends, we propose a design leading 
to compact data representation and some measurable speedups, particularly in the 
context of functional programs. 

This effort results in a clean architecture based on just a few data structures, 
which lends itself to experimentation with alternative techniques. 

6.1 Motivation 

In recent years improvements in processor performance have been due more and 
more to increased parallelism, while the trend of rising processor clock frequency 
has dramatically slowed down. In contrast to what happens with instruction-level 
parallelism, the task parallelism offered by modern multi-cores must be explicitly 
exploited by the software, if any speedup is to be obtained [90].

As multi-core architectures support a shared-memory model, the techniques
presented here extend from the now ubiquitous desktop multi-core machines to
the older multi-socket SMPs, and to most recent medium-size parallel machines
containing several multi-core CPU dies.

The architecture we illustrate here is also suitable for sequential machines, but
the need for such software is particularly stringent in a parallel context. In a
sense the rise of the number of CPUs amplifies the memory wall problem: the
memory bandwidth is a limited hardware resource which all cores have to share, and
raising the parallelism degree inevitably tightens up the bottleneck, even without
any synchronization.

6.1.1 Boehm's garbage collector 

Boehm's garbage collector [15, 12] is the natural point of comparison for our work 
because of several design similarities, including the idea of (partially) conservative 
pointer finding, and the use of Unix signals to interrupt mutators'^. For this reason 
it is worth briefly highlighting the main objectives we have set forth for our
implementation, in order to better explain the need for our effort and to illustrate key 
similarities and differences. Our objectives also more or less dictate several design 
and implementation choices which we prefer to make explicit from the beginning. 

First of all, C is clearly the language providing the best control on performance
for such a low level implementation where each memory access matters. A slightly
less obvious choice is determined by the typical usage of parallel systems, tending to
concentrate on bulk processing rather than interactive applications: for this reason
we consider bandwidth, and not latency, to be a priority; this choice excludes most
incremental schemes and favors a stop-the-world model where many threads can
mutate in parallel or collect in parallel, but without any time overlap between the
two phases, all of which is similar to Boehm's solution. Since we are interested in
the allocation pattern of functional programs, consisting in a large number of small
objects, it is paramount to make good use of the limited space in the primary and
secondary caches (henceforth simply L1 and L2), by tightly packing objects together:
we want to avoid padding space between heap objects and not to force alignment
constraints not specified by the user. Anyway, even if functional programs are our
first concern, we would like our collector to be also useful for (human-written) C
programs, which encourages us to adopt a non-moving strategy like mark-sweep and
to avoid safe-points and use conservative pointer finding for roots; on the other
hand there is no reason why other heap objects should not be traced exactly. The
collector API should be usable by humans, but not necessarily similar to malloc(),
an important difference with respect to Boehm's collector.

^The architecture shown here does not generalize so well to NUMA machines, more suitable
as they are to a message-passing style where each task runs in its own addressing space;
message-passing is also interesting, as the same interfaces could scale up to parallel computation over the
network.

Moving away from thread parallelism to pure process parallelism (one heap per process) would
essentially eliminate the problem of parallel non-distributed garbage collection, but such a revolution
appears unlikely. Other organizations like NUMA machines composed by SMP nodes, or machines
where the NUMA effect is pronounced only between "distant" nodes, look more realistic and are
already being adopted by some current high-class machines [23]. For such a hybrid SMP-in-NUMA
model the techniques shown here apply at the SMP level, just in the same way as they would apply
to each single machine in a cluster of SMPs.

^Notwithstanding the outdated information at
http://www.hpl.hp.com/personal/Hans_Boehm/gc/gcdescr.html Boehm's collector now also employs signals to stop mutators on all
major platforms except Windows, where Unix signals are not supported but an analogous
mechanism exists for suspending a thread from another thread. [13] mentions GNU/Linux, Solaris,
Irix and Tru64. The Windows implementation in win32_threads.c uses signal-like primitives like
SuspendThread().

6.1.2 High-level design 

Most of our implementation ideas rely on a variant of the classic BiBOP strategy
[82, 26] which, despite its simplicity, has been exploited surprisingly little: the only
discussion of an actually implemented similar solution that we have found is in a
little-cited 1993 paper by E. Ulrich Kriegel [45].

In the different context of today, we propose BiBOP as a good match for modern 
multiprocessor architectures. 

We cannot claim novelty for most ideas, some of which are variations of very old
implementation techniques, as is understandable after fifty years^ of research.

Nonetheless, we feel that our organization may have at least some aesthetic 
value, in terms of its data structures and C interface. 

Our main idea is that the BiBOP scheme is appropriate for reducing memory 
pressure on machines with modern memory hierarchies; we describe this point by 
introducing the concept of data density which we show to be at least one reason for 
the good performance of our implementation. 

6.1.3 The functional hypothesis 

Functional programs tend to allocate many small objects, the great majority of 
which have one of only a few possible "shapes"; in practice, most heap objects will 
be conses, nodes of balanced binary trees, or more generally components of inductive 
data structures with fixed size and layout, often containing some constant attributes 
which must be frequently inspected at runtime, such as the tags of our sum types 
(§5.4.4.3). Depending on the programming style closures might also be allocated in 
quantity; allocating other objects tends to be statistically much less frequent, hence 
less critical for performance. 

We define the above set of assumptions as the functional hypothesis: our system
is designed to run most efficiently when this hypothesis holds, yet epsilongc
can and does work with any language, and may even be directly employed for
user-written C programs.



^3 We remark one last time that McCarthy also introduced garbage collection, in his wonderful
[51].

6.2 The user view: kinds, sources and pumps

At a very high level, any automatic memory management system serves to provide 
an illusion of infinity: an unlimited stream of objects created on demand, each 
satisfying some specified requirements such as size and alignment. 

Objects which are no longer useful can simply be ignored: there is no need,
in general, for a user interface to the recycling system itself, as the whole point of
garbage collection is to make object reuse invisible to the user, who just keeps
creating more objects as if the memory were unlimited.

The user-level API is built upon three main data structures: the kind, each 
instance of which defines one particular set of requirements for a group of homoge- 
neous objects, the source, which arranges for the creation of objects of one specified 
kind, and the pump, providing a single mutator thread with objects from a given 
source on demand, one object at a time. 

6.2.1 Kinds 

We define a kind as the specific representation of a group of homogeneous heap
objects. Each kind is characterized by a given object size, object alignment, a
tracer function specifying how to mark the pointers contained in an object given its
address, and particular metadata values: metadata include^4 an integer tag and a
pointer, whose values are shared by all the objects of the same kind. Given a pointer
to a heap object, mutators are permitted to inspect, but not modify, its metadata.

In general a kind should not be confused with a type: rather than a type it 
identifies one case among the potentially many variants which, together, make up a 
type. For example a cons kind could be defined, but not a list kind, which would 
also comprise the empty list case, having of course a different representation — by 
the way, reasonably unboxed, as in §5.4.4.3. 

The tag could be usefully employed in a dynamically-typed language such as
Lisp, for example in order to test at runtime whether a given object is actually a
cons. In a statically-typed language like ML the tag can encode the constructor of
tagged-sum objects. The pointer metadatum can be useful to refer to any
reflection-related data not fitting in a single integer.

All the needed kinds are typically defined at initialization time, as global struc- 
tures shared by all mutator threads. 

6.2.2 Sources 

From the user's point of view a source can be seen as a global inexhaustible supply
of objects of a given kind. In the typical case the user will define exactly one source
per kind at initialization time, as an object shared by all mutator threads; after
initialization mutator threads will only refer to sources in order to create their pumps.



^4 Even if they currently comprise only tag and pointer, more metadata can be easily added in
the future if the need arises.

6.2.3 Pumps

A pump is a thread-local data structure implementing but one user-level functional- 
ity, the creation of an object. 

Each mutator thread will create its own pumps referring to the shared, global
sources, then use its pumps to obtain new objects. Pumps have to be explicitly
destroyed at thread exit time.

6.2.4 Kindless objects 

The strategy outlined above (creating objects of some kind which has been defined
in advance) suffices for the great majority of the objects ever created at runtime: for
example in Lisp most heap-allocated objects will be (s-)conses, and Prolog heaps
will mostly be made of terms. We call kinded all the objects created as shown above.

Some other heap-allocated objects do not fit so well in the picture, as it is not
possible to foresee their exact size in advance: arrays and character strings come to
mind^5. We provide more "traditional" allocation primitives for such kindless objects.

Notice how the kindless object API (see Figure 6.1) provides for less control:
vector elements can be either all potential pointers, or they can be guaranteed by
the user to include no pointers. There is not much control on metadata either:
all objects share the same^6 tag and metadatum pointer; a user requiring more
expressive metadata has to explicitly encode them in the payload. For reasons of
general applicability and performance, we assume that boxedness tags are not
available (§3.3.2.1).

6.2.5 Miscellaneous user functionalities

Other primitives are provided to initialize and finalize the collector, to register and 
unregister roots, to notify the memory system about new threads or exited threads, 
to explicitly force a collection, and to temporarily disable collections and re-enable 
them. 

As all of this is canonical and not particularly interesting, we will not further 
pursue such details. 

6.3 Implementation 

Despite their visual intuitiveness, the data structures above were designed primarily 
for efficiency, and the actual role of each structure is not apparent to the user: in 
particular the central data structure, the page, is completely hidden. 

^5 Other slightly less obvious cases are procedure activation records, which some runtimes of
Scheme, Prolog and SML allocate on the heap; if the language supports dynamic code
generation even code blocks (either machine language or bytecode) might be heap-allocated and
garbage collected.

^6 The actual values can be specified at initialization time, but nonetheless they must be the
same for all kindless objects; it is typically reasonable to choose some values not used for kinds, so
that at least kindless objects can be distinguished from kinded ones.

/* A tracer is a pointer to a function taking a pointer as its parameter and
   returning nothing: */
typedef void (*epsilongc_tracer_t)(epsilongc_word_t);

/* Create a kind: */
epsilongc_kind_t epsilongc_make_kind(const size_t object_size_in_words,
                                     const epsilongc_unsigned_integer_t
                                     pointers_per_object_in_the_worst_case,
                                     const size_t object_alignment_in_words,
                                     const epsilongc_metadatum_tag_t tag,
                                     const epsilongc_metadatum_pointer_t pointer,
                                     const epsilongc_tracer_t tracer);

/* Create a source from a kind: */
epsilongc_source_t epsilongc_make_source(epsilongc_kind_t k);

/* Initialize a (thread-local) pump from a source: */
void epsilongc_initialize_pump(epsilongc_pump_t pump,
                               epsilongc_source_t source);

/* Finalize a pump before exiting the thread: */
void epsilongc_finalize_pump(epsilongc_pump_t pump);

/* Allocate a kinded object from a thread-local pump: */
epsilongc_word_t epsilongc_allocate_from(epsilongc_pump_t pump);

/* Lookup metadata: */
epsilongc_tag_t epsilongc_object_to_tag(const epsilongc_word_t object);

epsilongc_metadatum_pointer_t
epsilongc_object_to_metadatum_pointer(const epsilongc_word_t object);

epsilongc_integer_t epsilongc_object_to_size_in_words(const epsilongc_word_t object);

/* Allocate kindless objects: */
epsilongc_word_t epsilongc_allocate_words_conservative(const epsilongc_integer_t size_in_words);
epsilongc_word_t epsilongc_allocate_words_leaf(const epsilongc_integer_t size_in_words);
epsilongc_word_t epsilongc_allocate_bytes_conservative(const epsilongc_integer_t size_in_bytes);
epsilongc_word_t epsilongc_allocate_bytes_leaf(const epsilongc_integer_t size_in_bytes);

Figure 6.1: epsilongc's essential user-level API.
The source above is directly copied from header files, with only GCC function attributes (to
force inlining and such) removed and comments eliminated. Despite looking unconventional
the interface is not particularly complex, and in fact is conceived so that
performance-critical operations such as epsilongc_allocate_from() and metadata lookup functions
can be easily re-implemented in assembly, to be generated by a compiler as intrinsics.

Pointers are essential in the implementation of any language requiring dynamic
memory allocation, and in order to make pointers easier to recognize at runtime in
the absence of boxedness tags and their dereference more efficient^7, we restrict the
set of heap pointers considered valid to word-aligned pointers; one word is also the
minimum size of a heap object representable without space overhead, and all the
integers internally used in the implementation are of type intptr_t, so that the size
of all memory structures remains a multiple of the word size.

The description below will proceed from the bottom up: since many data
structures and operations are usable with different collection strategies with little
or no modification, we illustrate the various possible operations before our way of
combining them, in the spirit of separating policy from mechanism.

6.3.1 Kinded objects 

We represent each kinded object as a buffer of words, with no header; the rationale
of this choice is discussed in more depth in §6.3.8, but the main idea is simply to
have long packed arrays of objects in memory, without any padding unless absolutely
necessary^8.

6.3.2 BiBOP pages 

All kinded objects are allocated from data structures called pages^9, similar to
Kriegel's "STSS cards" [45]: whenever a pump returns a pointer to a new object,
the resulting address will refer to a word contained in a page.

Each page can only contain objects of one kind. For each kind any number of
pages, including zero, may exist at any given time.

All pages have the same size, which must be a power of two; the page size is also
equal to its alignment: the rightmost $\log_2$ epsilongc_PAGE_SIZE_IN_BYTES bits
of a page pointer are always guaranteed to be zero.

A page is divided into page header, mark array and object slot array.

^7 On many RISC architectures pointers to misaligned objects may not be just a performance
concern: some processor families such as MIPS and Sparc simply raise an exception in response
to any attempt to dereference a non-word-aligned pointer. Others, such as the x86 family, execute
the misaligned dereference, but impose a heavy execution time penalty.

We prefer to simply forbid such pointers for all architectures, which may improve performance
and helps to avoid the misidentification of many false pointers.

We also assume convertibility from integer to pointer and vice versa without loss of information:
even if not mandated by the C Standard (the type intptr_t itself is optional in [37]) such an
assumption is in practice true on all architectures.

^8 Padding must sometimes be introduced in order to respect the alignment constraints stated
by the user: for example the user might require a three-word structure to be aligned to two or four
words; in such cases there is no way to avoid wasting some space for each object.

^9 There is no a priori relation between BiBOP pages and operating system pages, whose sizes
may well be different: BiBOP pages will typically be at least a few times larger than operating
system pages, but still smaller than the L2 cache. In the following we always use the term page to
mean "BiBOP page".

Page header The page header contains a copy of the kind metadata, which of
course are valid for all the objects in the page; the object referred to by the metadatum
pointer, if any, is shared by all the pages of the same kind: only the pointer is copied.

Other information contained in the header includes kind-dependent data such
as the object size and effective size, the payload offset, and the number of object
slots in the page. All of this is computed once and for all when a kind is created,
and simply copied at page initialization time. The address of the first dead slot (see
below) is also held in the header.

Since the header has offset zero within the page, given a pointer to any kinded
object, even interior, the address of its page header can be trivially obtained by
bitwise ANDing the pointer and the page mask, defined as the bitwise negation of
epsilongc_PAGE_SIZE_IN_BYTES - 1. This allows mutators to access metadata
at runtime with an overhead of two to four assembly instructions, when needed;
on the other hand the negligible space overhead of storing metadata once per page
makes this solution completely acceptable even for languages which don't make use
of them.
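The masking step can be illustrated with a few lines of C. This is only a sketch under stated assumptions: the page size, the header fields and the names pointer_to_page, pointer_to_tag and make_page are hypothetical and not taken from epsilongc, and C11 aligned_alloc() stands in for the actual page allocator.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical page size: any power of two works. */
#define PAGE_SIZE_IN_BYTES ((uintptr_t)1 << 16)
/* The page mask: bitwise negation of (page size - 1). */
#define PAGE_MASK (~(PAGE_SIZE_IN_BYTES - 1))

/* Hypothetical header layout; a real header holds more fields. */
struct page_header {
  intptr_t tag;
  void *metadatum_pointer;
  size_t object_size_in_words;
};

/* Map a (possibly interior) object pointer to its page header: one AND. */
static inline struct page_header *pointer_to_page(const void *p) {
  return (struct page_header *)((uintptr_t)p & PAGE_MASK);
}

/* Metadata lookup: one AND plus one load. */
static inline intptr_t pointer_to_tag(const void *p) {
  return pointer_to_page(p)->tag;
}

/* Allocate one page aligned to its own size; return NULL on failure. */
static struct page_header *make_page(intptr_t tag, size_t object_size_in_words) {
  struct page_header *page = aligned_alloc(PAGE_SIZE_IN_BYTES, PAGE_SIZE_IN_BYTES);
  if (page == NULL)
    return NULL;
  page->tag = tag;
  page->metadatum_pointer = NULL;
  page->object_size_in_words = object_size_in_words;
  return page;
}
```

Any address inside the page, including an interior pointer into the middle of an object, yields the same header address, which is why no per-object header is needed.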

Mark array The mark array is placed right after the header, with no padding: 
since the header size is a multiple of the word size, the mark array is guaranteed to 
always begin at a word boundary. 

The mark array stores liveness information for each object^10 in the page: since
we currently need only one bit per object, the array could conceptually always be
implemented as a bit vector.

As marking is parallel, mark arrays are concurrently updated by several threads,
which requires some atomic memory accesses (see §6.3.6). On many machines byte
stores are always atomic, and even when suitable atomic instructions for bitwise
operations are provided, working with a byte vector may be more efficient^11. On
(hypothetical) architectures where the compiler did not support the required intrinsics,
and where an atomic byte store were not provided, one could use a word vector. The
implementation allows the user to choose at configuration time among bit, byte or
word, bit being the default.
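The difference between the byte and bit variants can be sketched with C11 atomics. This is an illustration only, not epsilongc's code: the function names are hypothetical, and we assume relaxed memory ordering is sufficient for mark elements. The byte variant needs only a store, while the bit variant needs an atomic read-modify-write on the containing word.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Byte-vector variant: one byte per object, so a plain atomic store suffices. */
static inline void mark_byte(atomic_uchar *marks, size_t i) {
  atomic_store_explicit(&marks[i], 1, memory_order_relaxed);
}

static inline bool is_marked_byte(atomic_uchar *marks, size_t i) {
  return atomic_load_explicit(&marks[i], memory_order_relaxed) != 0;
}

/* Bit-vector variant: one bit per object, so setting a bit must not lose
   concurrent updates to the other bits of the same word. */
#define BITS_PER_WORD (sizeof(uintptr_t) * CHAR_BIT)

/* Return true if this call is the one which set the bit. */
static inline bool mark_bit(atomic_uintptr_t *marks, size_t i) {
  const uintptr_t bit = (uintptr_t)1 << (i % BITS_PER_WORD);
  const uintptr_t old = atomic_fetch_or_explicit(&marks[i / BITS_PER_WORD], bit,
                                                 memory_order_relaxed);
  return (old & bit) == 0;
}

static inline bool is_marked_bit(atomic_uintptr_t *marks, size_t i) {
  const uintptr_t bit = (uintptr_t)1 << (i % BITS_PER_WORD);
  return (atomic_load_explicit(&marks[i / BITS_PER_WORD],
                               memory_order_relaxed) & bit) != 0;
}
```

The bit vector is denser, hence friendlier to the cache, but pays for an atomic fetch-or; which variant wins is machine-dependent, consistently with the remark above.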

Alternatively, it is possible to enable out-of-page mark arrays at configuration
time, so that mark arrays are stored as separate malloc()ed buffers; in this case the
mark array area in a page degenerates to a single pointer, and accessing the mark
array from a page requires one indirection. Our original rationale for implementing
this strategy was to avoid some cache conflict misses due to the fact that mark arrays
share the same alignment on all pages. As benchmarks showed that this is not a
problem in practice with modern multi-way set associative caches, this strategy has
not been pursued further by separating headers from slot arrays.

^10 It is interesting to compare this with Boehm's collector, which stores one element per object
word, thus making tracing simpler. We have chosen to slightly complicate the mapping from mark
array elements to objects instead, to speed up the critical operation of page sweeping, and in
general trading more computation for fewer memory accesses.

^11 [12], written in 2000, compares the solutions on several architectures, finding that the optimal
solution depends on the machine. According to our recent tests, the best strategy between bit
arrays and byte arrays remains machine-dependent.

Object slot array The object slot array begins after the end of the mark array, 
at the first word with the required alignment. Object slots contain the payload of 
each page. At any given time each object slot may be either used or unused: when 
used it contains an object payload; when unused, its first word contains a pointer 
to the next unused object in the same page, or NULL in the case of the last unused 
slot. 

For each page unused slots make up an independent free-list where elements are 
always ordered by address. 

In order to avoid mistaking free list pointers in unused objects for pointers in used
objects during conservative pointer finding, free list pointers are stored in concealed
form by default^12.

Concealing consists in applying some function $c$ to a free list pointer; it
is important for $c$ to be bijective, as concealing and then unconcealing (i.e. applying
$c^{-1}$ to) a pointer must preserve information.

$c$ is trivially implemented as a C macro computing the successor function in
unsigned (wrap-around) arithmetic: since its domain consists of word-aligned
pointers, the elements of its image are guaranteed to be misaligned, hence they cannot
be mistaken for pointers. The cost of applying either $c$ or $c^{-1}$ is one assembly
instruction requiring no memory accesses^13.
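A minimal sketch of concealing follows. The names conceal, unconceal and is_word_aligned are ours, and inline functions replace the C macro mentioned in the text; the only thing taken from the description is the technique itself, namely the successor function in unsigned arithmetic and its inverse.

```c
#include <assert.h>
#include <stdint.h>

typedef uintptr_t word_t;

/* c: the successor function in unsigned (wrap-around) arithmetic. */
static inline word_t conceal(word_t pointer) {
  return pointer + 1;
}

/* The inverse of c: the predecessor function. */
static inline word_t unconceal(word_t concealed) {
  return concealed - 1;
}

/* Only word-aligned words are ever considered candidate pointers. */
static inline int is_word_aligned(word_t w) {
  return (w % sizeof(word_t)) == 0;
}
```

Since every valid free list pointer is word-aligned, its concealed form has a nonzero remainder modulo the word size and is rejected by the alignment test before any table lookup. Note that NULL conceals to 1, so the end of a free list is also stored as a misaligned word.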

Depending on the kind, some unused space may be present between the end of
the mark array and the beginning of the slot array, and at the end of the page;
each of these two padding spaces is strictly smaller than the object effective
size.
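The page layout can be worked out as in the following sketch, which computes how many slots fit in a page and where the slot array begins. Everything here is an assumption made for illustration: the structure and function names, the choice of a bit-vector mark array rounded up to whole words, and the definition of effective size as the object size rounded up to the object alignment.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define WORD_SIZE sizeof(void *)

struct page_layout {
  size_t slot_count;
  size_t slot_array_offset; /* in bytes, from the page beginning */
  size_t effective_size;    /* bytes per slot, alignment padding included */
  size_t leading_padding;   /* between the mark array and the slot array */
  size_t trailing_padding;  /* after the last slot */
};

static size_t round_up(size_t value, size_t multiple) {
  return (value + multiple - 1) / multiple * multiple;
}

/* All sizes are in bytes. We assume that the alignment is a nonzero multiple
   of the word size and that it divides the page size. */
static struct page_layout compute_layout(size_t page_size, size_t header_size,
                                         size_t object_size, size_t alignment) {
  assert(alignment > 0 && alignment % WORD_SIZE == 0);
  assert(object_size > 0 && header_size <= page_size);
  assert(page_size % alignment == 0);
  struct page_layout layout;
  layout.effective_size = round_up(object_size, alignment);
  /* Start from an optimistic slot count, then shrink until everything fits. */
  size_t slot_count = (page_size - header_size) / layout.effective_size;
  size_t mark_bytes;
  for (;; slot_count--) {
    mark_bytes = round_up((slot_count + CHAR_BIT - 1) / CHAR_BIT, WORD_SIZE);
    layout.slot_array_offset = round_up(header_size + mark_bytes, alignment);
    if (layout.slot_array_offset + slot_count * layout.effective_size <= page_size)
      break; /* always reached at slot_count == 0 at the latest */
  }
  layout.slot_count = slot_count;
  layout.leading_padding = layout.slot_array_offset - (header_size + mark_bytes);
  layout.trailing_padding =
      page_size - (layout.slot_array_offset + slot_count * layout.effective_size);
  return layout;
}
```

For example, with a 4096-byte page, a 64-byte header and 24-byte objects aligned to 16 bytes, the effective size is 32 bytes; 126 slots would overflow the page by 16 bytes, so the page holds 125 slots starting at offset 80, with 16 bytes of trailing padding.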

The global page table The global page table serves to recognize which part of 
the address space is being used for the garbage-collected heap; such information is 
important for avoiding dereferencing false pointers when doing conservative pointer 
finding. 

Moreover, the collector needs to be able to recognize whether a heap pointer refers
to a kinded object in a page slot array or to a large object; no particular provision
is needed for kindless small objects, but we defer the justification of this fact to
§6.3.5. Since we support interior pointers for large objects, it must also be possible
to efficiently map an arbitrary (word-aligned) interior pointer to an initial pointer.

We call candidate pointer a word which is suspected to be a (possibly interior)
object pointer at marking time, and candidate page the address of the hypothetical
page which would contain the object referred to by a candidate pointer. Of course
candidate pages have alignment $\log_2$ epsilongc_PAGE_SIZE_IN_BYTES.

^12 Free list pointer concealing can be disabled at configuration time.

^13 Assuming instructions such as either inc/dec or add/sub with a small immediate parameter;
again, all modern machines satisfy this condition.

At an abstract level, the table implements a function $f$ mapping a non-NULL
candidate page $p$ to an element $s$ of the disjoint sum

$\mathit{Sort} = \{\mathit{kinded}\} + \{\mathit{nonheap}\} + \mathit{LargeObjects}$

If $f : p \mapsto \mathit{kinded}$ then the candidate page $p$ is actually a page; if instead
$f : p \mapsto \mathit{nonheap}$ then $p$ is a pointer referring to some object out of the
garbage-collected heap, or a false pointer. Otherwise $f : p \mapsto l$, where $l$ is the address of the
beginning of the large object containing the word pointed to by $p$.

Given a value for $p$ stored as a key, a simple encoding allows us to represent any
element of $\mathit{Sort}$ in a single word: NULL represents nonheap, $s = p$ stands for kinded,
and any other value of $s$ is interpreted as a large object pointer.

The table is implemented as a simple resizable chained hash where the first 
element of each bucket is stored within the bucket pointer array itself as first 
described in [97]; the hash function is modulo. 

One essential optimization at mark time consists in not consulting the page table at all,
which would be comparatively expensive, for NULL or misaligned candidate pointers^14.
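The table lookup, the one-word encoding and the early filtering of NULL or misaligned words can be put together as in the sketch below. It is deliberately simplified and rests on assumptions of ours: the table has a fixed prime number of buckets instead of being resizable, there is no locking, the names are hypothetical, and we assume that no large object begins exactly at the address used as its key, since the encoding would otherwise read it as kinded.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define PAGE_SIZE_IN_BYTES ((uintptr_t)1 << 16)
#define PAGE_MASK (~(PAGE_SIZE_IN_BYTES - 1))
#define BUCKET_COUNT 1021 /* hypothetical fixed size */

struct element {
  uintptr_t key;        /* candidate page address; 0 means "unused" */
  uintptr_t sort;       /* equal to key: kinded; otherwise: large object start */
  struct element *next; /* overflow chain, malloc()ed only on collision */
};

/* The first element of each bucket lives in the bucket array itself. */
static struct element table[BUCKET_COUNT];

enum sort_tag { NONHEAP, KINDED, LARGE };

/* Register a page; return 0 on success, -1 if malloc() fails. */
static int table_insert(uintptr_t page, uintptr_t sort) {
  assert(page != 0 && (page & ~PAGE_MASK) == 0);
  struct element *first = &table[page % BUCKET_COUNT]; /* the hash is modulo */
  if (first->key == 0) {
    first->key = page;
    first->sort = sort;
    return 0;
  }
  struct element *extra = malloc(sizeof *extra);
  if (extra == NULL)
    return -1;
  extra->key = page;
  extra->sort = sort;
  extra->next = first->next;
  first->next = extra;
  return 0;
}

/* Classify a candidate pointer; *large_object is set only for LARGE. */
static enum sort_tag classify(uintptr_t candidate, uintptr_t *large_object) {
  /* Cheap filter: never consult the table for NULL or misaligned words. */
  if (candidate == 0 || candidate % sizeof(uintptr_t) != 0)
    return NONHEAP;
  const uintptr_t page = candidate & PAGE_MASK;
  if (page == 0)
    return NONHEAP; /* key 0 is reserved to mean "unused" */
  for (const struct element *e = &table[page % BUCKET_COUNT]; e != NULL; e = e->next)
    if (e->key == page) {
      if (e->sort == page)
        return KINDED;
      *large_object = e->sort;
      return LARGE;
    }
  return NONHEAP; /* absence from the table plays the role of NULL */
}
```

During collection only classify() runs, and since it never writes to the table it needs no critical section.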

It is interesting to notice how all updates to the global page table occur at
mutation time, when creating or destroying^15 pages and large objects; unfortunately
such updates require critical sections which, short as they are, may nonetheless limit
scalability. By contrast at collection time the table is only read, which allows us to
completely avoid critical sections for table access during that stage.

6.3.2.1 Page creation 

Creating a page involves allocating space from the C heap, filling the header fields, 
initializing the mark and object slot arrays and registering the page in global struc- 
tures. 

Because of the alignment requirements we currently allocate pages with
posix_memalign()^16; as this may involve a kernel call and/or synchronization in the C
library, such an operation tends to be both expensive and hard to parallelize.

Filling the header involves little more than copying some fields from the kind
data structure, which is directly referred to by the source, and making the free-list
head point to the payload beginning. None of this is performance-critical.

The mark array has to be zeroed at creation, with a memset() call. This should
be relatively efficient, just involving some evictions from L1; however having the
mark array in L1 at page creation time does not buy us anything, as mark arrays
are only touched during collection. If out-of-page mark arrays are enabled then we
should add a malloc() call to the cost.

Building the free list involves some memory traffic, as all objects need to be
touched. Unless objects have effective size larger than a cache line the complete
object slot array has to be brought into cache. Even if this phase by itself is expensive,
it may work like a sort of prefetching: if the page is used soon, all of it will already
be loaded at least in the L2 cache.

We define backward free list building^17 as the strategy of building the free list starting
from the last slot which will be used for allocation. This solution has locality
advantages in case of large page size, under the assumption that a just-created page
will be used soon for allocating: if the page size is larger than the L1 data cache,
building the free list backwards makes it very likely that the memory touched first
while allocating will be already in L1; the rest of the page will be still in L2. It is
possible to choose between forward and backward free list building at configuration
time.

The final step is registering the page in the page table, which requires a critical
section on the global mutex, plus a malloc() call within the critical section in case
of hash collision.

All of this makes page creation a relatively expensive and non-scalable operation.

^14 This optimization is the reason why we don't include NULL in the domain of $f$: we use the
value NULL as a key in a hash table element out of the bucket to mean that the element is currently
unused.

^15 See §6.3.6 for the reason why pages must be destroyed at mutation rather than collection time.

^16 An interesting alternative to explore would involve using mmap() to allocate a group of pages;
for some (non-GNU) implementations of posix_memalign(), the mmap() solution might incur a
significantly lower space overhead, at the cost of always involving the kernel in page creation.
Using mmap() could in fact make deallocation more portable, as free()ing buffers allocated with
posix_memalign() is only permitted on GNU systems, as far as we know ([48], "Allocating Aligned
Memory Blocks", currently at subsection 3.2.2.7).

However the mmap() solution has some issues of its own: mmap() only guarantees
sysconf(_SC_PAGESIZE) alignment, hence pages could only be reasonably mmapped in large groups, with some
space overhead at the beginning and the end. Making epsilongc_PAGE_SIZE_IN_BYTES equal to
sysconf(_SC_PAGESIZE) would solve the space overhead problem, but at the price of forcing pages
to be unacceptably small. Unmmapping space from the middle of a mmapped buffer is supported, but
deallocation of single pages would still be a problem unless epsilongc_PAGE_SIZE_IN_BYTES were
chosen to be a multiple of sysconf(_SC_PAGESIZE). Re-mmapping a previously unmmapped part of
a buffer is typically supported, even if such behavior is not mandated by POSIX. In addition we
would need some data structure to keep track of which pages in a large buffer are mmapped at any
given time.

Anyway, despite all the complexity, such an idea seems worthy of some exploration.

^17 The actual direction of free list building, from higher addresses down to lower ones or from
lower addresses up to higher ones, has no effect on performance as long as it is the opposite of
the allocation direction: note in particular how automatic hardware prefetching works in either
direction on modern processors ([23], section 3.3.2, "Single Threaded Sequential Access").

6.3.2.2 Page sweeping

Sweeping can be performed on an individual page without need for synchronization
or kernel calls. It simply involves scanning the mark array and, for each i-th element,
either clearing the corresponding element if array[i] is one, or making the i-th
object slot unused by re-adding it to the free-list if array[i] is zero. Since the mark
array is examined in order (either forward or backward, as per the free-list building
direction), free list elements are kept ordered by address in the list. All the words
of dead objects other than the first one are overwritten^18, to prevent future false
pointers referring to the slot from keeping alive the objects which were referred to by the now
dead slot.
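Backward free list building (§6.3.2.1) and page sweeping share the same traversal, which the following toy model tries to make concrete. It is a sketch under our own assumptions: the page is a small fixed-size structure with a byte-vector mark array, free list pointers are left unconcealed for readability, dead words are overwritten with 0xdead, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SLOT_COUNT 8 /* tiny, for illustration only */
#define WORDS_PER_SLOT 2
#define DEAD_WORD ((uintptr_t)0xdead)

struct toy_page {
  unsigned char marks[SLOT_COUNT];             /* byte-vector mark array */
  uintptr_t slots[SLOT_COUNT][WORDS_PER_SLOT]; /* object slot array */
  uintptr_t *free_list_head;                   /* first unused slot, or NULL */
};

/* Page creation: visit slots from the highest address down, so that the slot
   to be allocated first (the lowest one) is the one touched last. */
static void build_free_list_backward(struct toy_page *page) {
  memset(page->marks, 0, sizeof page->marks);
  uintptr_t *next = NULL;
  for (size_t i = SLOT_COUNT; i-- > 0; ) {
    page->slots[i][0] = (uintptr_t)next;
    next = page->slots[i];
  }
  page->free_list_head = next;
}

/* Sweep: marked slots survive and have their mark cleared; unmarked slots are
   overwritten and linked into a new free list, still ordered by address.
   Return the number of unused slots. */
static size_t sweep(struct toy_page *page) {
  size_t unused = 0;
  uintptr_t *next = NULL;
  for (size_t i = SLOT_COUNT; i-- > 0; ) {
    if (page->marks[i]) {
      page->marks[i] = 0;
      continue;
    }
    for (size_t w = 1; w < WORDS_PER_SLOT; w++)
      page->slots[i][w] = DEAD_WORD;
    page->slots[i][0] = (uintptr_t)next;
    next = page->slots[i];
    unused++;
  }
  page->free_list_head = next;
  return unused;
}
```

Both functions touch only the page they are given, which is why sweeping needs neither synchronization nor kernel calls.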

Memory access patterns in sweeping are similar to the ones in mark array
initialization and free-list construction; in particular a just-swept page will likely remain
cached at least in L2, and the next lines to be used will be in L1 if backward free
list building is enabled.

6.3.2.3 Page refurbishing 

It is possible to re-use an empty page of some kind for objects of another kind: such 
operation is called refurbishing, and involves reconstructing the header, mark array 
and free list. 

Refurbishing has essentially the same overhead as sweeping, and the cache effects
of the two operations are also comparable: allocation from a just-refurbished page
on the same thread which performed the refurbishing is efficient, as all the page cache
lines will still be in L1 and L2.

6.3.2.4 Page destruction 

Destroying a page involves its deallocation and removal from the global page ta- 
ble: such operations are expensive and non-scalable, involving synchronization and 
possibly kernel calls. 

6.3.3 Sources 

From the implementation point of view a source is quite a trivial structure, serving
as a repository of pages. Each source simply contains two lists of pages, the full pages
list and the non-full pages list, plus a mutex for synchronizing access to such lists.

6.3.4 Pumps 

Pumps are performance-critical structures whose purpose at the implementation
level consists in caching frequently accessed data about the objects to allocate.
Such criticality is evident from the API in Figure 6.1, showing how existing pump
data structures are initialized rather than dynamically allocated, in an effort to save
a pointer indirection at runtime: pumps are conceived to be declared in programs
as thread variables of type struct epsilongc_pump, rather than as pointers.

At any given moment a pump may conceptually "contain" a page reserved to the
allocating thread, or no page; of course at the implementation level such an inclusion
is represented with a page pointer field. Its other relevant field is the current head
of the page free list, again kept in the pump rather than in the contained page in
order to avoid a pointer indirection at allocation time: in fact the free-list head field
of the page is, counter-intuitively, not updated at each allocation. The free-list head
field of the pump is set to NULL when the pump contains no page.

^18 Each word is overwritten with a configuration-dependent value impossible to mistake for a
pointer: either the 0xdead constant (which is easy to recognize for humans) if the collector is
configured in debug mode, or otherwise simply 0 (which might lead to a slightly more efficient
implementation on some architectures, possibly saving a load immediate instruction). Overwriting
dead slots can also be completely disabled at configuration time.

6.3.4.1 The allocation function 

Despite allocation being the only user-level operation on a pump, such a
functionality is very performance-critical. Allocating from a given pump involves
unconcealing the free-list field into a temporary variable, if non-NULL dereferencing it,
setting the free-list head to the just loaded value and finally returning the
temporary. This shorter and far more common execution path is carefully optimized and
costs about ten assembly instructions, with no taken^19 jumps; the other execution
path is taken at page change time, when a page is filled and another one
must be acquired from the relevant pool, or at the first allocation for a pump with
no page: it involves synchronization with the pool mutex and access to its lists. If
no non-full pages are available, a page is taken from a global empty pages list (at the
cost of one further synchronization) and refurbished if needed. If no empty pages
are available, a heuristic is employed to decide whether to create a new page, or
to trigger a collection. Page change is also taken as the occasion for destroying
empty pages, if a heuristic says that there are more than enough: the rationale
here is to avoid destroying pages too frequently, since they might be needed again
and both creation and destruction are expensive.
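The fast path described above can be sketched as follows. This is not epsilongc's actual code: the structure, the macros and the function names are hypothetical, we assume that an exhausted pump holds the concealed form of NULL, and the slow path (page change, refurbishing, collection triggering) is reduced to a stub which reports exhaustion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t word_t;

#define CONCEAL(p) ((word_t)(p) + 1)
#define UNCONCEAL(c) ((word_t)(c) - 1)

/* Tell the compiler that the slow path is rare, where supported. */
#if defined(__GNUC__)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#define UNLIKELY(x) (x)
#endif

struct toy_pump {
  word_t concealed_free_list_head; /* CONCEAL(NULL) when no slot is available */
  void *page;                      /* the page "contained" in the pump, if any */
};

/* Stub for the slow path: a real collector would acquire another page here. */
static word_t *allocate_slow_path(struct toy_pump *pump) {
  (void)pump;
  return NULL;
}

static inline word_t *allocate_from(struct toy_pump *pump) {
  /* Unconceal the free-list head into a temporary. */
  word_t *object = (word_t *)UNCONCEAL(pump->concealed_free_list_head);
  if (UNLIKELY(object == NULL))
    return allocate_slow_path(pump);
  /* The first word of an unused slot holds the concealed next pointer, which
     becomes the new head without being unconcealed. */
  pump->concealed_free_list_head = *object;
  return object;
}
```

The common path consists of one subtraction, one comparison, one load and one store, and the only branch is not taken; the page header is never touched.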

Repeatedly allocating from a page which was recently swept by the same thread
and which contains many unused slots should be cache-friendly: sweeping works like
a prefetch phase to load the page payload into the L1 or L2 cache, and even without
on-demand sweep the hardware automatic prefetch may be activated when there is
much free space on the page, as consecutive addresses are generated. Using pumps
automatically guarantees that a page is only used for allocation by one CPU at a
time, which avoids cache ping-pong.

6.3.5 Kindless and large objects

The data structures and primitives shown above provide no hints about the
implementation of kindless objects, yet the idea is quite simple. A set of implicit kinds,
sources and per-thread pumps^20, of user-definable sizes, are automatically defined:
in this sense most kindless objects are just kinded objects "in disguise", only slightly
less efficient because of the need for mapping an object size to a pump at runtime,
and because of the possibility of internal fragmentation: not all possible sizes will
be realistically provided, so the allocation of an object of a given size might be
satisfied by using a larger buffer. For each size two kinds are provided, one with a
fully conservative tracer, and another one with a leaf tracer (called "atomic" in the
jargon of Boehm's collector).

^19 It is worth giving GCC an optimization hint with __builtin_expect().

^20 Implicit pumps are created at thread registration and destroyed at thread un-registration
time.
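Mapping a requested size to an implicit kind can be as simple as the sketch below. The list of sizes is a made-up example of a user configuration, and the function names are hypothetical; a linear search is used only for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-defined implicit kind sizes, in words, in ascending order. */
static const size_t implicit_kind_sizes_in_words[] = {1, 2, 3, 4, 6, 8, 12, 16};
#define IMPLICIT_KIND_COUNT \
  (sizeof implicit_kind_sizes_in_words / sizeof implicit_kind_sizes_in_words[0])

/* Return the index of the smallest implicit kind able to hold the request, or
   -1 when the object has to be treated as a large object. */
static int size_to_implicit_kind(size_t size_in_words) {
  for (size_t i = 0; i < IMPLICIT_KIND_COUNT; i++)
    if (size_in_words <= implicit_kind_sizes_in_words[i])
      return (int)i;
  return -1;
}

/* Words wasted when the request is served by a larger slot. */
static size_t internal_fragmentation_in_words(size_t size_in_words) {
  const int kind = size_to_implicit_kind(size_in_words);
  assert(kind >= 0);
  return implicit_kind_sizes_in_words[kind] - size_in_words;
}
```

With these sizes a five-word request is served from the six-word kind, wasting one word, while a seventeen-word request falls outside the implicit kinds. Each index would select one of two pumps, conservative or leaf, according to the allocation primitive being called.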

It is easy to see how the solution above is not completely general, as it cannot
satisfy allocation requests for objects larger than a page or even just larger than
the maximum implicit kind size which has been fixed by the user. A different
mechanism is provided for large objects, which are simply allocated one by one
with malloc() and destroyed with free(). Their implementation is simple-minded
and quite inefficient in both space and time, which given the functional hypothesis
should hopefully not be a serious problem. Of course the user-level API completely hides the
difference between implicitly-kinded and large objects.

6.3.6 Garbage collection 

A collection is initiated by one mutator, which stops all the other mutators with a
signal. This choice has the advantage of allowing a simple user API, but significantly
complicates the collector implementation: any function not reentrant with respect
to signals, notably including malloc() and free(), cannot be used at collection
time. This is the reason why empty pages have to be destroyed at mutation rather
than collection time.

The collection phase may internally proceed in two different orders according to 
a configuration option: if on-demand sweeping is enabled, as per the default, the 
three sub-phases are non-deferred sweeping, root marking and marking, otherwise 
they are root marking, marking and sweeping. In any case it is central to maintain 
the invariant according to which a complete heap marking is followed by a complete 
sweeping, before the next marking can begin. 

On-demand sweeping consists in sweeping a page during mutation at page change
time, just before allocation from it begins. Such a choice is more cache-friendly than
the traditional stop-the-world sweep, but it may leave some pages still to be swept
when a collection begins: the non-deferred sweeping sub-phase, typically very short,
serves to sweep such remaining pages. Non-deferred sweeping and stop-the-world
sweeping share the exact same implementation.
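
The interplay between on-demand and non-deferred sweeping can be sketched as
follows. This is a minimal model, not the epsilongc implementation: the page
descriptor is reduced to a single flag, marking is elided, and all names
(demand_page, sweep_one_page, begin_allocating_from, collect) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical page descriptor: only the sweeping state is modelled. */
struct demand_page {
  bool swept;  /* false from the end of marking until the page is swept */
};

/* The one sweeping routine, shared by every flavour of sweeping. */
void sweep_one_page(struct demand_page *p)
{
  /* ...a real collector would rebuild the page free list here... */
  p->swept = true;
}

/* Mutation time: called at page change, just before allocating from p. */
void begin_allocating_from(struct demand_page *p)
{
  if (!p->swept)
    sweep_one_page(p);  /* on-demand sweeping */
}

/* Collection time, with on-demand sweeping enabled.  Returns how many pages
   the non-deferred sweeping sub-phase had to sweep. */
size_t collect(struct demand_page *pages, size_t page_no)
{
  size_t swept_now = 0;
  assert(pages != NULL || page_no == 0);
  /* Sub-phase 1: non-deferred sweeping of the pages no mutator reached. */
  for (size_t i = 0; i < page_no; i++)
    if (!pages[i].swept) {
      sweep_one_page(&pages[i]);
      swept_now++;
    }
  /* Sub-phases 2 and 3: root marking and marking (elided).  After a complete
     marking, every page awaits sweeping again. */
  for (size_t i = 0; i < page_no; i++)
    pages[i].swept = false;
  return swept_now;
}
```

The first loop of collect() is what preserves the invariant stated above: no
marking starts until every page left over from the previous marking has been
swept.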

After collection all mutators are restarted with a second signal. 

Root marking. Root marking is very simple, and currently sequential. Just like
Boehm's collector in most of its configurations, it uses setjmp() for finding register
roots in a portable way.
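
The setjmp() technique can be sketched as follows: the call is used only for
its side effect of copying registers into a buffer in memory, which is then
scanned conservatively one word at a time. The names root_visitor,
scan_register_roots and count_root are hypothetical. The sketch also assumes
that the C library stores register contents unmodified; some libraries
scramble part of the saved state, so a production collector needs
platform-specific care.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type: receives each word which might be a pointer. */
typedef void (*root_visitor)(void *candidate, void *context);

/* Conservatively scan the registers of the calling thread, passing every
   saved word to the visitor.  Returns the number of words scanned. */
size_t scan_register_roots(root_visitor visit, void *context)
{
  jmp_buf registers;
  const size_t word_no = sizeof(registers) / sizeof(void *);
  assert(visit != NULL);
  memset(&registers, 0, sizeof(registers));
  (void) setjmp(registers);  /* save registers; we never longjmp() back */
  for (size_t i = 0; i < word_no; i++) {
    void *candidate;
    memcpy(&candidate, (char *) &registers + i * sizeof(void *),
           sizeof(void *));
    visit(candidate, context);
  }
  return word_no;
}

/* Example visitor which just counts the candidate words it is given. */
void count_root(void *candidate, void *context)
{
  (void) candidate;
  ++*(size_t *) context;
}
```

A real visitor would check whether each candidate falls within a heap page and,
if so, mark the object it points to.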

Marking. Given the atomicity of mark array stores, marking can easily proceed in
parallel without synchronization, if we accept the possibility of some
(statistically unlikely) duplicate work; our implementation is quite canonical and
closely follows Boehm's [14], with load balancing in the style of Taura and
Yonezawa [28]. It should be noted that the BiBOP organization does not affect
marking in any significant way.
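
The lock-free marking discipline can be sketched on a toy heap as follows. This
is an illustration rather than the epsilongc code: objects are array indices
instead of addresses, load balancing is omitted, and the names (toy_heap,
mark_from) are hypothetical. C11 relaxed atomics are used for the mark bytes
so that concurrent execution of mark_from() by several threads is well
defined.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define OBJECT_NO 8
#define FIELD_NO 2

/* Toy heap: each object has FIELD_NO fields holding object indices, or -1
   for a non-pointer.  The mark array lives apart from the objects. */
struct toy_heap {
  int fields[OBJECT_NO][FIELD_NO];
  atomic_uchar marks[OBJECT_NO];  /* one mark byte per object */
};

/* Mark everything reachable from root, using an explicit stack.  There is no
   lock and no compare-and-swap: two threads may both see the same object
   unmarked and both trace it, which is redundant but harmless. */
void mark_from(struct toy_heap *h, int root)
{
  /* A thread traces each object at most once, so it pushes at most
     OBJECT_NO * FIELD_NO children plus the root. */
  int stack[OBJECT_NO * FIELD_NO + 1];
  size_t top = 0;
  assert(root >= 0 && root < OBJECT_NO);
  stack[top++] = root;
  while (top > 0) {
    int o = stack[--top];
    if (o < 0 || atomic_load_explicit(&h->marks[o], memory_order_relaxed))
      continue;  /* not a pointer, or already marked */
    atomic_store_explicit(&h->marks[o], 1, memory_order_relaxed);
    for (int f = 0; f < FIELD_NO; f++)
      stack[top++] = h->fields[o][f];
  }
}
```

The only cost of dropping synchronization is that the window between the load
and the store lets a second thread start tracing the same object.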
Sweeping. Parallel sweeping is even simpler, with pages dictating the natural
granularity for the operation of each thread: pages are simply taken from a list,
swept and put back into another list.
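
The loop executed by each sweeping thread can be sketched as follows. The page
layout and the names (page, take_page, sweep_page, sweep_all) are hypothetical,
and the locking which a parallel sweeper needs around the two lists is omitted
for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SLOTS_PER_PAGE 4

struct page {
  bool marks[SLOTS_PER_PAGE];      /* mark array, kept out of object slots */
  int free_slots[SLOTS_PER_PAGE];  /* indices of dead slots found by a sweep */
  size_t free_slot_no;
  struct page *next;
};

/* Detach and return the first page of *list, or NULL if the list is empty.
   A parallel sweeper would protect this with a mutex. */
struct page *take_page(struct page **list)
{
  struct page *p = *list;
  if (p != NULL) {
    *list = p->next;
    p->next = NULL;
  }
  return p;
}

/* Record every unmarked slot as free, and clear marks for the next cycle. */
void sweep_page(struct page *p)
{
  p->free_slot_no = 0;
  for (int i = 0; i < SLOTS_PER_PAGE; i++) {
    if (!p->marks[i])
      p->free_slots[p->free_slot_no++] = i;
    p->marks[i] = false;
  }
}

/* The loop of one sweeper: take a page, sweep it, put it in the other list.
   Returns the number of slots reclaimed. */
size_t sweep_all(struct page **to_sweep, struct page **swept)
{
  size_t reclaimed = 0;
  struct page *p;
  assert(to_sweep != NULL && swept != NULL);
  while ((p = take_page(to_sweep)) != NULL) {
    sweep_page(p);
    reclaimed += p->free_slot_no;
    p->next = *swept;
    *swept = p;
  }
  return reclaimed;
}
```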



6.3.7 Synchronization 

One interesting and possibly original detail involves our locking style: in order
to prevent a collection from starting during a critical section at mutation time, a
global read-write lock is locked for reading at mutation, before acquiring the relevant
mutex; the collection triggering function, before sending the signal, locks the same
read-write lock for writing.



6.3.8 Data density 

The system internally measures object size and alignment in machine words. One
word is the minimum size of a kinded object which can be represented without
padding, in the absence of alignment constraints specified by the user; with an
alignment greater than one word, it becomes necessary in some cases to add some
padding space right after the object payload. We call the effective size of an
object the sum of its size and its alignment padding.

Given a kind $k$ of objects with alignment $a_k$ and size $s_k$, we define the
effective size $e_k$ needed to store each object, and the corresponding data density
$d_k$, the number of objects representable per word, as:

$$e_k = a_k \left\lceil \frac{s_k}{a_k} \right\rceil \qquad\qquad d_k = \frac{1}{e_k}$$



The definitions above intentionally disregard all the sources of memory overhead
outside object slot arrays, including mark arrays and all garbage collector data
structures. The rationale is that density is not meant as a measure of memory
occupation, but rather as an index of the number of objects fitting in a cache line:
as mark arrays and other collector data structures are mostly accessed at different
times from the objects per se and reside in different cache lines, optimizing data
density maximizes the amount of useful information stored in the physically limited
cache space at mutation time.

Data density may reasonably be defined in the same way independently of the
garbage collecting strategy, and indeed it is of some interest to compare the values
of $d_k$ in different memory management systems for two kinds which are widely
employed in functional programs, the cons (two words) and the non-empty node of
a Red-Black binary tree of one given color (three words: left, datum and right).
Neither kind has alignment requirements, hence $a_{cons} = a_{node} = 1$.

Several systems such as the GNU libc malloc() facility [48], all the other
allocators derived from Doug Lea's malloc() and, even more interestingly, Boehm's
collector [15], allocate all buffers at double-word-aligned addresses and may also
add some internal status information to each buffer; metadata, when needed, must
be represented as part of each object, adding to $s_k$. Many other systems instead,
including for example OCaml, do not force any alignment but always add one
header word per object, sufficient to include a short tag, which again we consider
part of $s_k$.

Footnote to "a Red-Black binary tree of one given color": The example trivially
generalizes to AVL trees, the idea being simply that the balance-related
information can usefully be represented as metadata rather than data.

If metadata are accessed at runtime, as is the case with dynamically-typed
languages, with Boehm's collector we have $d_{cons} = d_{node} = \frac{1}{4}$. When
metadata are not needed Boehm has optimal density in the cons case with
$d_{cons} = \frac{1}{2}$, but again $d_{node} = \frac{1}{4}$. In OCaml, with or without
metadata, $d_{cons} = \frac{1}{3}$ and $d_{node} = \frac{1}{4}$.

Independently of the need for metadata at runtime our model allows us to reach
optimal density for both kinds, with $d_{cons} = \frac{1}{2}$ and $d_{node} = \frac{1}{3}$.
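
As a worked check of the comparison in this section, the sketch below computes
effective sizes and densities. It assumes that the effective size is the size
rounded up to the next multiple of the alignment, which is a reconstruction
from the prose definition of alignment padding rather than a formula quoted
from epsilongc; the function names are hypothetical. Sizes and alignments are
in machine words, with one extra word of size wherever a system stores
per-object metadata.

```c
#include <assert.h>
#include <stddef.h>

/* Effective size in words: the size rounded up to a multiple of the
   alignment, that is the size plus its alignment padding. */
size_t effective_size(size_t size, size_t alignment)
{
  assert(size > 0 && alignment > 0);
  return (size + alignment - 1) / alignment * alignment;
}

/* Data density: the number of objects representable per word. */
double density(size_t size, size_t alignment)
{
  return 1.0 / (double) effective_size(size, alignment);
}
```

For instance a cons with one metadata word under double-word alignment has
size 3 and alignment 2, hence effective size 4 and density 0.25; a bare
two-word cons with no alignment constraint has effective size 2 and density
0.5.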

The data density of a particular representation seems likely to play a role in the
overall efficiency of the system, even ignoring the cost of allocation and collection
and considering only object accesses; however, further empirical evidence will be
needed to confirm this supposition for real-world programs.

6.3.9 Closures 

Functional programs written in certain styles or CPS-transformed (§5.4.4.6) create
a considerable number of short-lived closures at runtime. At first sight such a
scenario does not seem to respect the functional hypothesis, as in principle closures
can have many different shapes, depending on the number of non-locals captured
in the environment, and on whether each non-local is a pointer or a non-pointer.

Even if allocating all closures as kindless objects would work, the overhead of such 
a simple-minded solution is in fact easy to avoid. 

First of all it should be observed that the great majority of functions need either
zero or one variable in their non-local environment; it may be worth adding specific
kinds for such common cases, and possibly also for the most performance-critical
functions with larger non-local environments, when it is possible to recognize them
with compile-time heuristics or after profiling.

The number of needed kinds can be reduced by establishing a convention for ordering
non-locals in their environment arrays, according to whether they are pointers or
not: either first all pointers then all non-pointers, or vice-versa.
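
The ordering convention can be sketched as follows, choosing "pointers first".
The names (nonlocal, normalize_environment) are hypothetical, and a compiler
would perform this reordering once per closure shape at compile time rather
than at runtime. With such a convention an environment of n non-locals has
only n + 1 possible shapes, one per number of pointers, instead of the 2^n
shapes possible when pointers and non-pointers are freely interleaved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one captured non-local variable. */
struct nonlocal {
  uintptr_t value;
  bool is_pointer;
};

/* Reorder env in place so that all pointers come first, preserving the
   relative order within each group.  Returns the number of pointers, which
   together with n identifies the kind of the closure. */
size_t normalize_environment(struct nonlocal *env, size_t n)
{
  size_t pointer_no = 0;
  assert(env != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    if (env[i].is_pointer) {
      struct nonlocal p = env[i];
      /* Shift the non-pointers found so far one place to the right. */
      for (size_t j = i; j > pointer_no; j--)
        env[j] = env[j - 1];
      env[pointer_no++] = p;
    }
  return pointer_no;
}
```

A tracer for the resulting kind only needs to scan the first pointer_no words
of the environment, conservatively or exactly.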

The idea of normalizing the representation is a sort of pattern in the BiBOP
scheme, generalizable to many other cases when using statically typed languages
or ε personalities: there is no reason why two cases of different concrete types,
possibly completely unconnected at a semantic level but with the same effective
size and number of potential pointer fields, cannot be represented in such a way
as to share the same kind.

Footnote to "one header word per object" (§6.3.8): Some systems add even more than
one header word per object. Sun's JDK, MMTk [11] and Microsoft's CLR, for example,
use two words.

Footnote to "Functional programs written in certain styles": And in particular when
using simple compilers or interpreters: higher-order code can be simplified with
flow analysis.

6.3.10 Lazy and object-oriented personalities 

Lazy languages require a slightly more sophisticated data representation than call- 
by-value languages, as in a realistic implementation it must be possible to destruc- 
tively update a still-unevaluated thunk, and replace it with the result at the end of 
its computation. 

Unsurprisingly, epsilongc does not provide any support for changing the kind of
an existing object while maintaining its identity; that could be possible at collection
time in a moving scheme, but not with mark-sweep.

Any standard solution already employed by the collectors for lazy languages
such as Haskell can be adopted. Unfortunately some of the cleanness of the BiBOP
model is lost in this case: data needs to be tagged with at least a boolean (two in
a concurrent environment: objects may be thunks, in flux or ready) recording the
evaluation state of an object; any unused bit sequence in the payload or even the
mark array entry of the object can do the job.

Accessing possibly still-to-be-evaluated objects will often require a conditional at 
runtime, just like in conventional implementations of lazy languages; after an object 
is known to be ready, BiBOP metadata can be accessed just as for eager languages. 

Such a solution also necessarily requires some form of synchronization if the 
mutator threads are more than one: of course it is always possible to add a synchro- 
nization word in the payload, if needed. 

From this point of view the situation is not different for "managed" languages such
as Java, where each object contains a header word reserved for that purpose; yet we
believe that not forcing such an expensive representation for all objects is
preferable in the general case. The user can always implement some additional logic
where needed, outside the memory management system per se.

For most runtimes there is no reason for keeping one mutex per object, and even
lazy languages such as Haskell normally employ strictness analysis to statically
recognize many cases in which laziness is not needed, and more efficient traditional
representations can be safely used.

The work about Prolific Types [77] is relevant for object-oriented languages.

Footnote to "but not with mark-sweep": In a moving BiBOP collector otherwise similar
to epsilongc it might be reasonable to split each kind into an evaluated kind, plus
a thunk-or-evaluated one: all the alive evaluated objects of a thunk-or-evaluated
kind would be re-kinded at collection time. This idea does not look particularly
hard to implement, but keeping the collector both efficient and language-agnostic
might be challenging. Moving-time hooks definable by the user would solve the
problem, at some cost; the overhead could be reduced by allowing the hooks to be
recompiled as part of the collector, to be called as inline functions, as described
for example in [95].

6.4 Status

epsilongc's implementation totals around 5000 lines of heavily commented C code,
quite easy to understand for such a low-level concurrent piece of code, which does
not spare C macros, #ifdefs, GCC function attributes and intrinsics in order to
support Autoconf options and be as general and efficient as possible.

In preliminary micro-benchmarks (http://ageinghacker.net/publications/gc-draft.pdf)
epsilongc appears to perform better than Boehm's collector; however, no realistic
parallel workload has been measured, and we feel that our conclusions can only be
tentative in this respect.

Our collector is not currently used by ε: since the current implementation of ε
still relies on Guile for the s-expression frontend (§5.4.5) it also shares its memory
management system, which was a custom sequential mark-sweep garbage collector up to
Guile 1.8.x, then replaced with Boehm's collector in the new Guile 2.0.x series. We
expect to integrate epsilongc into ε as soon as we drop the dependency on Guile, or
when we write a compiler; the latter can be done even while keeping the interactive
system linked with Guile, without re-implementing the frontend, at the cost of not
having access to the frontend from compiled code.

Support for mutexes and other imperative synchronization features is trivial to
add to ε's implementation with C primitives.

Exploiting the BiBOP organization from ε₁ does not appear particularly problematic.
It will be interesting to test the benefits of the BiBOP strategy in code strongly
based on sum-of-products (§5.4.4.3), such as in complex transforms; the necessary
changes in the representation of sum tags do not seem very involved.

epsilongc will work as it is with ε, but in the longer term we plan to turn
the current mark-sweep collector into the old generation of a generational system,
where the younger generation is copying; this will be particularly relevant for the
allocation patterns of CPS code, which tends to produce short-lived objects at a
high rate. Implementing a collector which can be interrupted at any time by signals
has been a fun and instructive challenge, but in the future we plan to seize the
opportunity of coping with a moving collector in the young generation to introduce
safe points. epsilongc also needs a couple of new functionalities, the most urgent
of which are support for finalization and weak pointers.

The epsilongc sources have been committed to the main ε repository (see
https://savannah.gnu.org/bzr/?group=epsilon), as an independent subdirectory with
its own build system.

Like the rest of the system it is free software, released under the GNU GPL version
3 or later [31].

6.5 Summary 

We implemented epsilongc, a parallel mark-sweep conservative-pointer-finding garbage
collector for multicore machines. Conceived for ε, it is general enough to be used
by other systems as well.

In order to exploit the memory hierarchies of modern machines, we pack data
in a dense way without prefixing every object with a header, segregating objects by
memory representation in a BiBOP organization.

This solution is most appropriate for functional personalities, in which most 
objects belong to one of a small set of kinds. 



Conclusion 



We formally specified and implemented a practical extensible programming language 
based on a very small first-order imperative core, plus powerful syntactic abstraction 
features: Lisp-style macros map user s-expression syntax into expression data struc- 
tures; user-specified transforms permit arbitrary code-to-code transformation, with 
the intent of supporting extended syntactic features which are gradually "trans- 
formed away" into core forms. This open-ended approach enables research and 
experimentation. 

As examples of the power of our extension mechanisms, we used transforms to 
implement higher-order lexically-scoped anonymous procedures and first-class con- 
tinuations, on top of a core language only supporting named global procedures. 

The language is very expressive and permits reflection and self-modification; it is 
possible to update the global state of the system by global modifications, possibly 
up until a state where the program is "static", convenient for analysis and compilable 
with traditional techniques. 

We formally developed an analysis for static programs, and proved a soundness 
property about it with respect to the dynamic semantics. We argue that such formal 
reasoning is only possible thanks to the size and simplicity of the core language. 

The state of the system can be saved and restored with unexec and exec facilities 
based on marshalling. 

The language supports asynchronous threads and is suitable for modern multi- 
core machines. We implemented a parallel garbage collector, not yet integrated in 
the system, to limit garbage collection bottlenecks. 

The implementation is not mature yet, but can be played with. The bulk of 
the system is written in itself, using C for the runtime, and Guile as a temporary 
dependency for bootstrapping. 

An official part of the GNU project, epsilon is free software, released under the
GNU GPL version 3 or later [31]. Its home page is
http://www.gnu.org/software/epsilon.

The source code is managed on a public bzr server, and a public mailing list 
is available for discussion: see https://savannah.gnu.org/projects/epsilon for 
more information. 
Title in French: GNU epsilon - un langage de programmation extensible





Title in English: GNU epsilon - an extensible programming language



Abstract in English: Reductionism is a viable strategy for designing and
implementing practical programming languages, leading to solutions which are easier
to extend, experiment with and formally analyze.

We formally specify and implement an extensible programming language, based on a
minimalistic first-order imperative core language plus strong abstraction
mechanisms, reflection and self-modification features. The language can be extended
to very high levels: by using Lisp-style macros and code-to-code transforms which
automatically rewrite high-level expressions into core forms, we define closures and
first-class continuations on top of the core.

Non-self-modifying programs can be analyzed and formally reasoned upon, thanks to
the simple semantics of the language. We formally develop a static analysis and
prove a soundness property with respect to the dynamic semantics.

We develop a parallel garbage collector suitable for multi-core machines to permit
efficient execution of parallel programs.